The Annals of Statistics 

2006, Vol. 34, No. 6, 2707-2756 

DOI: 10.1214/009053606000000731 

© Institute of Mathematical Statistics, 2006 

SEMIPARAMETRICALLY EFFICIENT RANK-BASED 
INFERENCE FOR SHAPE 
I. OPTIMAL RANK-BASED TESTS FOR SPHERICITY 

By Marc Hallin and Davy Paindaveine

Université Libre de Bruxelles

We propose a class of rank-based procedures for testing that the 
shape matrix $V$ of an elliptical distribution (with unspecified center
of symmetry, scale and radial density) has some fixed value $V_0$;
this includes, for $V_0 = I_k$, the problem of testing for sphericity as
an important particular case. The proposed tests are invariant under 
translations, monotone radial transformations, rotations and reflec- 
tions with respect to the estimated center of symmetry. They are valid 
without any moment assumption. For adequately chosen scores, they 
are locally asymptotically maximin (in the Le Cam sense) at given 
radial densities. They are strictly distribution-free when the center 
of symmetry is specified, and asymptotically so when it must be es- 
timated. The multivariate ranks used throughout are those of the 
distances, in the metric associated with the null value $V_0$ of the
shape matrix, between the observations and the (estimated) center
of the distribution. Local powers (against elliptical alternatives) and 
asymptotic relative efficiencies (AREs) are derived with respect to the
adjusted Mauchly test (a modified version of the Gaussian likelihood 
ratio procedure proposed by Muirhead and Waternaux [Biometrika 
67 (1980) 31-43]) or, equivalently, with respect to (an extension of) 
the test for sphericity introduced by John [Biometrika 58 (1971) 169- 
174]. For Gaussian scores, these AREs are uniformly larger than one, 
irrespective of the actual radial density. Necessary and/or sufficient 
conditions for consistency under nonlocal, possibly nonelliptical al- 
ternatives are given. Finite sample performances are investigated via 
a Monte Carlo study.

1. Introduction.

1.1. The hypothesis of sphericity. The distribution of a $k$-dimensional
random vector $X$ is called spherical if, for some $\theta \in \mathbb{R}^k$, the distribution
of $X - \theta$ is invariant under orthogonal transformations. For multinormal
variables, sphericity is equivalent to the covariance matrix $\Sigma$ of $X$ being
proportional to the identity matrix $I_k$. Under the assumption of ellipticity,
finite second-order moments need not exist and sphericity is equivalent to
the shape matrix $V$ being equal to the unit matrix $I_k$; see Section 1.2 for
precise definitions. Under more general nonelliptical distributions, however,
this equivalence no longer holds: $V = I_k$ (the hypothesis of unit shape) does
not imply sphericity, nor even that the directions $U := (X - \theta)/\|X - \theta\|$
be uniform over the unit sphere (the hypothesis of isotropy), and $\Sigma$ (when
finite) and $V$ no longer coincide up to a positive scalar factor. The hypothesis
of sphericity thus is a strict subhypothesis of the hypothesis of isotropy, itself
a strict subhypothesis of the unit shape hypothesis. While isotropy and unit
shape only deal with the $U$'s, that is, with the directional features of $X - \theta$,
sphericity also imposes a strong symmetry structure on the moduli $\|X - \theta\|$.
This symmetry plays a crucial role in the approach we are adopting here
and the null hypothesis of interest throughout is the hypothesis of spherical
symmetry rather than that of unit shape; a detailed discussion of this issue
is provided in Section 5.

Sphericity assumptions play a key role in a number of statistical problems, 
although the distinction between sphericity, isotropy, unit shape and a
covariance matrix proportional to $I_k$ is far from clear in the literature. Indeed,
additional assumptions (Gaussian or elliptical densities, finite second-order
moments, etc.), the necessity of which is all too rarely questioned, in general
cause the aforementioned assumptions to coincide. Besides this role as
a technical assumption underlying some of the most frequently used statistical
methods, null hypotheses of the type $V = V_0$, which (depending on the
assumptions) reduce to the hypotheses of sphericity, isotropy or unit shape
by substituting $V_0^{-1/2}(X - \theta)$ for $(X - \theta)$, are also of direct interest in
several specific domains of application such as geostatistics, paleomagnetic 
studies in geology, animal navigation, astronomy and wind direction data; 
see [5, 34, 53] or [35] for references. Finally, shape matrices provide robust 
alternatives to traditional covariance matrices; as such, they are obvious 
candidates for serving as the basic tools in a host of multivariate analysis 
techniques. Null hypotheses of the form $V = V_0$, in their various guises (reducing
to sphericity, isotropy or unit shape) are thus of interest in their own
right. 

Because of its importance for applications, the problem of testing the
hypothesis of sphericity has a long history and has generated a considerable
body of literature which we only very briefly summarize here. For normal
populations, the asymptotic theory has been thoroughly investigated. The 
Gaussian likelihood ratio test was introduced by Mauchly [36] and belongs 
to the classical toolkit of multivariate analysis. The (Gaussian) locally most 
powerful invariant (under shift, scale and orthogonal transformations) test 
was obtained by John [24, 25] and by Sugiura [48]. In their original ver- 
sions, these tests are valid under Gaussian assumptions only; however, with 
slight modifications, they remain valid under elliptical populations with fi- 
nite fourth-order moments; see Section 5.3 of [38] for the adjusted Mauchly 
test and Section 3.3 of the present paper for an adjusted version of John's 
test. Without elliptical symmetry, however, these adjusted tests are no longer 
valid; therefore, they qualify as tests of sphericity, not as tests of isotropy or 
unit shape. Moreover, it has been shown (see [23]) that they behave rather 
badly under heavy tails (a fact that is confirmed by the Monte Carlo study in 
Section 6). Although they still require elliptical symmetry and finite fourth- 
order radial moments, a more robust behavior can be expected from the 
test statistics introduced by Tyler [50], who proposes replacing covariance 
matrices with more robust estimators of scatter. 

Non-Gaussian models have been investigated by Kariya and Eaton [26], 
where elliptical densities, possibly with infinite variances, are considered. 
Uniformly most powerful unbiased tests are derived, basically against spec- 
ified nonspherical shape values. The results of that paper do not allow for 
more general optimality concepts (such as maximinity or stringency) involv- 
ing unspecified shape alternatives. Despite their obvious theoretical value, 
such tests thus have limited practical value. 

As a reaction to Gaussian and other strong distributional assumptions, 
nonparametric tests of sphericity have also been constructed. Their main 
advantage is that they remain consistent against all possible nonspherical 
alternatives, including the nonelliptical ones. The drawback is that they 
are computationally heavy and only achieve slow nonparametric consistency 
rates. Examples include Beran [6] and Koltchinskii and Sakhanenko [28] for 
the null hypothesis of ellipticity and Baringhaus [5] for sphericity. Another 
way of escaping Gaussian or fourth-order moment assumptions involves bas- 
ing the tests on statistics that are measurable with respect to invariant or 
distribution-free quantities such as the multivariate concepts of signs and 
ranks developed, mainly, in the robustness literature; see [39] for a review. 

This sign-and/or-rank-based approach has been adopted by Tyler [53], 
Ghosh and Sengupta [13] and Marden and Gao [33]. Tyler [53] addresses 
the problem of testing uniformity over the sphere for directional data and 
proposes a sign test related to his celebrated [52] estimator of shape. In 
a slightly different context, Ghosh and Sengupta [13] also propose a test 
entirely based on multivariate signs, that is, on cosines of the form
$U_i'U_j = (X_i - \theta)'(X_j - \theta)/(\|X_i - \theta\|\,\|X_j - \theta\|)$, where $X_i$, $i = 1, \ldots, n$,
denote the $k$-dimensional observations. These two multivariate sign tests are of a heuristic
nature and do not rely on any clear optimality concerns. Their advantages 
and disadvantages are those which are usually associated with sign tests: 
they remain valid under a broad class of densities and are consistent against 
a broad class of alternatives, none of which requires elliptical symmetry. As 
a test of sphericity, the Ghosh and Sengupta test is not consistent against 
nonspherical alternatives with unit shape matrix. Therefore, rather than a 
test of sphericity, it should be considered a test of isotropy, or even a test of 
the hypothesis of unit shape with isotropic fourth-order directional moments; 
see Section 5 for details and an extension to the null hypothesis of unit 
shape. If ellipticity is assumed (so sphericity becomes the null hypothesis of 
interest), however, restricting to signs leads to a substantial loss of efficiency 
since the distances $d_i := \|X_i - \theta\|$, which are not taken into account, then
also carry much relevant information. 

In Marden and Gao [33], a variety of structural hypotheses on covariance
matrices are considered, including sphericity and unit shape. Appropriate 
multivariate sign- and rank-based competitors of the Gaussian likelihood 
procedures are proposed. The ranks used by the authors are the spatial 
ranks introduced by Möttönen and Oja [37] and Chaudhuri [9]; see also [32].
Although Pitman efficiencies (with respect to the Gaussian methods) are 
obtained, no attempt is made to achieve any optimality and the authors 
restrict themselves to procedures of the Wilcoxon and sign test types; even 
so, they show that the sign-and-rank (Wilcoxon) procedures perform much 
better than those based on the signs alone, a finding that will be confirmed 
(both qualitatively and quantitatively) by the form of the information ma- 
trices we will derive in Section 2. 

The approach we are adopting in the present paper is in the same spirit. 
However, throughout, we combine robustness (distribution-freeness under 
sphericity, without any moment assumptions) and optimality concerns. Our 
tests are based on multivariate signs and the ranks of the norms of the
observations centered at $\theta$ (or an estimate $\hat\theta$), with test statistics that have
a structure similar to that of John's. According to whether the center of 
symmetry is specified or not, these statistics are strictly distribution- 
free under sphericity, or asymptotically so. With adequate scores, they are 
asymptotically optimal (in the Le Cam sense) against nonspherical elliptical 
distributions at chosen radial densities. In the elliptical setup, asymptotic 
relative efficiencies (AREs) with respect to the adjusted John and Mauchly 
tests are derived and appear to be surprisingly high (particularly for the 
van der Waerden version). Actually, Paindaveine [43] shows that the cel- 
ebrated Chernoff and Savage [10] result concerning AREs of normal score 
tests for location with respect to Student's test extends to the present situation:
the AREs of the normal score versions of our tests with respect to the traditional
John-Mauchly-Muirhead-Waternaux tests are uniformly larger than
one, irrespective of the underlying radial density. 

The optimality properties of our tests are related to local elliptical alter- 
natives; however, it is shown in Section 5.2 that provided nonconstant score 
functions are used (the "constant-score case" corresponds to the extended 
sign test proposed in Section 5.1), our tests nevertheless remain consistent 
against most elliptical as well as nonelliptical alternatives (some nonellipti- 
cal simulation results are reported in Section 6). We refer to Section 5 for a 
discussion of this matter. 

Some basic reasons for considering sphericity as an alternative to classical 
Gaussian assumptions have been discussed above. Another non-Gaussian 
extension of the assumption of Gaussian sphericity is the assumption of i.i.d.- 
ness, under which the k components of the observed X are independent and 
identically distributed (i.i.d.), with common unspecified symmetric marginal
density $f$. Under Gaussian marginals, i.i.d.-ness and sphericity coincide, but
not under general densities. In fact, Kac [27] shows that this hypothesis of 
i.i.d.-ness is rotation-invariant only on the class of multivariate Gaussian 
distributions. If Gaussian assumptions are abandoned, this hypothesis is 
no longer rotation-invariant and becomes strongly coordinate-dependent. 
Therefore, it does not fit into the semiparametric, coordinate-free setting we 
are adopting here. 

However, the same assumption of i.i.d.-ness implies unit shape. The null 
hypothesis of i.i.d.-ness is thus strictly included in that of unit shape. There- 
fore, one might consider testing i.i.d.-ness by performing a test of unit shape 
such as the extended sign test proposed in Section 5.1. Although valid from 
the point of view of type I risk, such a test, for the null hypothesis of i.i.d.-ness,
would be severely biased and inconsistent. Indeed, the Maxwell-Herschel
theorem (see, e.g., pages 51-52 of [8]) indicates that all non-Gaussian spherical
distributions are part of the alternative, while our Proposition 5.1(v)
establishes that $\alpha$-level extended sign tests at spherical alternatives have
asymptotic power $\alpha$. For all of these reasons, it seems that i.i.d.-ness, in
this context, is not the appropriate generalization of traditional Gaussian 
assumptions. 

1.2. Elliptical densities: location, scale, shape and radial density. Denote
by $X^{(n)} := (X_1^{(n)\prime}, \ldots, X_n^{(n)\prime})'$, $n \in \mathbb{N}$, a triangular array of $k$-dimensional
observations. Throughout, $X_1^{(n)}, \ldots, X_n^{(n)}$ are assumed to be i.i.d., with elliptical
density

(1.1)  $f_{\theta,\sigma^2,V;f_1}(x) := c_{k,f_1} \frac{1}{\sigma^k |V|^{1/2}} f_1\Bigl(\frac{1}{\sigma}\bigl((x - \theta)'V^{-1}(x - \theta)\bigr)^{1/2}\Bigr)$,  $x \in \mathbb{R}^k$,

where $\theta \in \mathbb{R}^k$ is a location parameter, $\sigma^2 \in \mathbb{R}_0^+ := (0, \infty)$ a scale parameter
and $V := (V_{ij})$, a symmetric positive definite real $k \times k$ matrix with $V_{11} = 1$,
a shape parameter. The infinite-dimensional parameter $f_1 : \mathbb{R}_0^+ \to \mathbb{R}^+ := [0, \infty)$
is an a.e. strictly positive function, the constant $c_{k,f_1}$ a normalization
factor depending on the dimension $k$ and $f_1$.

The function $f_1$ will be called, conveniently but improperly, a radial density
($f_1$ does not integrate to one, and is therefore not a probability density).
Denote by $d_i^{(n)} = d_i^{(n)}(\theta, V) := \|Z_i^{(n)}(\theta, V)\|$ the modulus of the centered and
sphericized observations $Z_i^{(n)} = Z_i^{(n)}(\theta, V) := V^{-1/2}(X_i^{(n)} - \theta)$, $i = 1, \ldots, n$.

If the $X_i^{(n)}$'s have density (1.1), these moduli are i.i.d., with density and
distribution functions

$r \mapsto \frac{1}{\sigma} \tilde f_{1k}(r/\sigma) := \frac{1}{\sigma \mu_{k-1;f_1}} (r/\sigma)^{k-1} f_1(r/\sigma) I_{[r > 0]}$

and

$r \mapsto \tilde F_{1k}(r/\sigma) := \int_0^{r/\sigma} \tilde f_{1k}(s)\,ds,$

respectively, provided, however, that

(1.2)  $\mu_{k-1;f_1} := \int_0^\infty r^{k-1} f_1(r)\,dr < \infty,$

an assumption we shall henceforth always make on $f_1$. This function $\tilde f_{1k}$
is the actual radial density and (1.2) thus merely ensures that it will be a
probability density function; in particular, it does not imply any moment
restriction on $\tilde f_{1k}$, the $d_i^{(n)}$'s nor the $X_i^{(n)}$'s. Any square root $V^{1/2}$ of $V$
[satisfying $V^{1/2}(V^{1/2})' = V$] can be used in the above definitions, provided, of
course, it is used in a consistent way. For the sake of simplicity, however, $A^{1/2}$
throughout stands for the symmetric root of any symmetric positive semidefinite
matrix $A$, thus avoiding superfluous "prime" notation.

Now, if $\sigma$ and $f_1$ (or, more precisely, $\sigma$ and $c_{k,f_1} f_1$) are to be identifiable,
a scale constraint is required. Still endeavoring to avoid moment restrictions,
we impose the condition that the $d_i^{(n)}$'s, under (1.1), have common median
$\sigma$, that is,

(1.3)  $\tilde F_{1k}(1) = 1/2$  or, equivalently,  $(\mu_{k-1;f_1})^{-1} \int_0^1 r^{k-1} f_1(r)\,dr = 1/2.$

When this is to be emphasized, we call $f_1$ a standardized radial density.
Special cases are:

(a) the $k$-variate multinormal distribution, with radial density
$f_1(r) = \phi_1(r) := \exp(-a_k r^2/2)$;

(b) the $k$-variate Student distributions, with radial densities (for $\nu$ degrees
of freedom) $f_1(r) = f^t_{1,\nu}(r) := (1 + a_{k,\nu} r^2/\nu)^{-(k+\nu)/2}$;

(c) the $k$-variate power-exponential distributions, with radial densities of
the form $f_1(r) = f^e_{1,\eta}(r) := \exp(-b_{k,\eta} r^{2\eta})$.

[The constants $a_k > 0$, $a_{k,\nu} > 0$ and $b_{k,\eta} > 0$ are such that (1.3) is satisfied;
note that $a_k = 2b_{k,1} = \lim_{\nu \to \infty} a_{k,\nu}$.]

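Writing $\mathrm{vech}\, M := (M_{11}, (\mathrm{vech}^{\circ} M)')'$ for the $k(k+1)/2$-dimensional vector
obtained by stacking the upper-triangular elements of a $k \times k$ symmetric
matrix $M = (M_{ij})$ (so that $\mathrm{vech}^{\circ} M$ collects all of these elements except $M_{11}$),
we denote by $P^{(n)}_{\theta,\sigma^2,V;f_1}$ or $P^{(n)}_{\vartheta;f_1}$ the distribution of
$X^{(n)}$ under given values of $\vartheta = (\theta', \sigma^2, (\mathrm{vech}^{\circ} V)')'$ and $f_1$ [$f_1$ satisfying (1.2)
and (1.3)]. The parameter space is thus $\Theta := \mathbb{R}^k \times \mathbb{R}_0^+ \times \mathcal{V}_k$, where $\mathcal{V}_k$ either
stands for the set of all $k \times k$ symmetric positive definite matrices $V$
such that $V_{11} = 1$ or for the corresponding set (in $\mathbb{R}^{k(k+1)/2 - 1}$) of values of
$\mathrm{vech}^{\circ} V$.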

The notation $R_i^{(n)} = R_i^{(n)}(\theta, V)$ will be used for the rank of $d_i^{(n)} = d_i^{(n)}(\theta, V)$
among $d_1^{(n)}, \ldots, d_n^{(n)}$; under $P^{(n)}_{\vartheta;f_1}$, the vector $(R_1^{(n)}, \ldots, R_n^{(n)})$ is uniformly
distributed over the $n!$ permutations of $(1, \ldots, n)$. Let $U_i^{(n)} = U_i^{(n)}(\theta, V) :=
Z_i^{(n)}/d_i^{(n)}$. The vectors $U_i^{(n)}$ under $P^{(n)}_{\vartheta;f_1}$ are i.i.d. and uniformly distributed
over the unit sphere. They are independent of the ranks $R_i^{(n)}$ and are usually
considered as multivariate signs associated with the centered observations
$(X_i - \theta)$ since they are totally insensitive to transformations of $(X_i - \theta)$
that preserve half-lines through the origin.
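
As a hedged illustration of these definitions, the following Python sketch computes the moduli, their ranks and the multivariate signs of a sample. The function name `signs_and_ranks` and its argument layout are assumptions made for this example and are not notation from the paper; the caller supplies an inverse square root of the shape matrix, and the sketch assumes that the distances have no ties and that no observation coincides with the center.

```python
import math


def signs_and_ranks(xs, theta, v_inv_sqrt):
    """Return (d, R, U): moduli, their ranks, and multivariate signs.

    xs         : list of k-dimensional observations (lists of floats)
    theta      : center of symmetry (list of k floats)
    v_inv_sqrt : k x k matrix V^{-1/2} (list of rows), supplied by the caller
    Assumes no ties among the moduli and no observation equal to theta.
    """
    zs = []
    for x in xs:
        centered = [xi - ti for xi, ti in zip(x, theta)]
        # Z_i = V^{-1/2} (X_i - theta)
        z = [sum(row[j] * centered[j] for j in range(len(centered)))
             for row in v_inv_sqrt]
        zs.append(z)
    # d_i = ||Z_i||
    d = [math.sqrt(sum(zj * zj for zj in z)) for z in zs]
    # R_i = rank of d_i among d_1, ..., d_n (1 = smallest)
    order = sorted(range(len(d)), key=lambda i: d[i])
    ranks = [0] * len(d)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # U_i = Z_i / d_i lies on the unit sphere
    u = [[zj / di for zj in z] for z, di in zip(zs, d)]
    return d, ranks, u
```

Rescaling every centered observation by a common positive factor, or applying any monotone increasing transformation to the moduli, leaves both the ranks and the signs returned by this sketch unchanged.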

The definition of the shape parameter $V$ under elliptic symmetry readily
follows from the special form of the density (1.1). A more general definition,
which remains valid under possibly nonelliptical symmetric distributions,
has been given by Tyler [52], where $V$ is defined as the unique symmetric
positive definite matrix such that $V_{11} = 1$ (Tyler actually uses the normalization
$\mathrm{tr}\, V = k$) and

$E\bigl[(X - \theta)(X - \theta)'/\bigl((X - \theta)'V^{-1}(X - \theta)\bigr)\bigr] = \frac{1}{k} V$

(where $\theta$ denotes the center of symmetry). The sample Tyler matrix
then provides a universally root-$n$ consistent estimator of $V$. This ingenious
extension may be somewhat misleading, however, as this "shape," in the
absence of ellipticity, has a much weaker, and purely directional, interpretation.
In particular, it is no longer associated with any family of contours
and, under finite second-order moments, it loses its relation to covariance
matrices, and hence much of its intuitive content.
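
A sample version of Tyler's defining equation is commonly computed by a fixed-point iteration. The sketch below is only an illustration under stated assumptions: the iteration scheme is the standard one from the robustness literature rather than something described in this section, the function name `tyler_shape_2d` is hypothetical, the center is taken as known, and the code is restricted to $k = 2$ so that the matrix inverse can be written out explicitly.

```python
def tyler_shape_2d(xs, theta, n_iter=200, tol=1e-12):
    """Fixed-point sketch of a sample Tyler shape matrix for k = 2.

    Iterates V <- sum_i x_i x_i' / (x_i' V^{-1} x_i) on the centered data,
    rescaled after each step so that V[0][0] = 1 (the normalization V_11 = 1).
    Assumes no observation equals theta.
    """
    centered = [(x[0] - theta[0], x[1] - theta[1]) for x in xs]
    v = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_iter):
        det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
        inv = [[v[1][1] / det, -v[0][1] / det],
               [-v[1][0] / det, v[0][0] / det]]
        s = [[0.0, 0.0], [0.0, 0.0]]
        for a, b in centered:
            # q = x' V^{-1} x
            q = a * (inv[0][0] * a + inv[0][1] * b) + b * (inv[1][0] * a + inv[1][1] * b)
            s[0][0] += a * a / q
            s[0][1] += a * b / q
            s[1][0] += a * b / q
            s[1][1] += b * b / q
        # Normalizing by the (1,1) entry absorbs the factor k/n.
        new = [[s[i][j] / s[0][0] for j in range(2)] for i in range(2)]
        change = max(abs(new[i][j] - v[i][j]) for i in range(2) for j in range(2))
        v = new
        if change < tol:
            break
    return v
```

For points spread evenly on the ellipse $x = (2\cos t, \sin t)$, the iteration should return a matrix close to $\mathrm{diag}(1, 1/4)$, the shape of that ellipse under the normalization $V_{11} = 1$.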

1.3. Outline of the paper. The problem we are considering is that of
testing the hypothesis that the shape matrix $V$ is equal to some given value
$V_0$ (admissible for a shape parameter). The special case $V_0 = I_k$, where $I_k$
stands for the $k$-dimensional identity matrix, yields the problem of testing for
sphericity. In the notation of the previous section, the shape matrix $V$ in this
problem is thus the parameter of interest, while $\theta$, $\sigma^2$ and $f_1$ play the role of
nuisance parameters. Hence, it is highly desirable that the null distributions
of the test statistics to be used remain invariant under variations of $\theta$, $\sigma^2$
and $f_1$.

When $\theta$ is specified, we achieve this objective by basing our tests on the
signs $U_i^{(n)}$ and ranks $R_i^{(n)}$ computed from $Z_i^{(n)}(\theta, V_0)$, $i = 1, \ldots, n$. These
tests are invariant under monotone radial transformations (including scale
transformations), rotations and reflections of the observations (with respect
to $\theta$), hence distribution-free with respect to such transformations. When $\theta$
is unspecified, the ranks and signs are to be computed from $Z_i^{(n)}(\hat\theta, V_0)$,
$i = 1, \ldots, n$, where $\hat\theta = \hat\theta^{(n)}$ is an arbitrary root-$n$ consistent estimator of
the location parameter $\theta$; however, for $\hat\theta$, we recommend the (rotation-equivariant)
spatial median of Möttönen and Oja [37] which is itself "sign-based."
This issue is treated in Section 4.4.

The tests $\phi^{(n)}$ based on these multivariate signed-rank statistics, whether
ranks and signs are computed from $\theta$ or from $\hat\theta$, are locally asymptotically
optimal (actually, locally asymptotically maximin-efficient, as the nonspecification
of the scale $\sigma$ induces a strict loss of efficiency) in the Le Cam
sense, under adequately chosen score functions. The test statistics take the
very simple form (dropping superfluous superscripts, $c$ being some positive
constant and $K$ a score function; see Section 4.2 for details)

(1.4)  $Q_K = c\bigl(\mathrm{tr}\, S_K^2 - \frac{1}{k} \mathrm{tr}^2 S_K\bigr)$  with  $S_K := \frac{1}{n} \sum_{i=1}^n K\bigl(\frac{R_i}{n+1}\bigr) U_i U_i'$,

to be compared with the Gaussian statistic of John [24] ($d$ is some positive
constant; see Section 3.3 for details),

1=1 i=l 

(1.5) 

The special case of a constant score [$K(u) = 1$, $0 < u < 1$] yields
$S_S := \frac{1}{n} \sum_{i=1}^n U_i U_i'$ and a test $\phi_S^{(n)}$ which is essentially that proposed by Ghosh
and Sengupta [13].
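
To make the structure of the rank-based statistic concrete, here is a minimal Python sketch. It assumes that the score-weighted matrix is $S_K = n^{-1} \sum_i K(R_i/(n+1)) U_i U_i'$ and that the statistic is proportional to $\mathrm{tr}\, S_K^2 - k^{-1} \mathrm{tr}^2 S_K$; the normalizing constant $c$ is deferred to Section 4.2 in the text, so it is deliberately left out here. The function name `rank_score_shape_statistic` is hypothetical.

```python
def rank_score_shape_statistic(ranks, signs, score):
    """Return (S_K, tr(S_K^2) - tr(S_K)^2 / k), i.e. Q_K up to its constant c.

    ranks : ranks R_1..R_n of the moduli (integers 1..n)
    signs : unit vectors U_1..U_n (lists of k floats)
    score : score function K defined on (0, 1)
    The constant c of the test statistic is NOT applied (not specified here).
    """
    n = len(ranks)
    k = len(signs[0])
    s = [[0.0] * k for _ in range(k)]
    for r, u in zip(ranks, signs):
        weight = score(r / (n + 1))
        for a in range(k):
            for b in range(k):
                s[a][b] += weight * u[a] * u[b] / n
    trace = sum(s[a][a] for a in range(k))
    trace_of_square = sum(s[a][b] * s[b][a] for a in range(k) for b in range(k))
    return s, trace_of_square - trace ** 2 / k
```

With the constant score `lambda u: 1.0` the matrix reduces to the sign-based $S_S$; the returned scalar is zero when the score-weighted signs are perfectly balanced across directions and grows as they concentrate along some axis.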

The rest of the paper is organized as follows. In Section 2 we establish
the local asymptotic normality result (with respect to the location, scale
and shape parameters) that provides the main theoretical tool of the paper.
This result allows the development of asymptotically optimal parametric
procedures for $V$ under specified values of $f_1$ and $\sigma$ (with possibly unspecified
$\theta$). This is explained in detail in Section 3 where we also derive the
asymptotically optimal (efficient, at given $f_1$) "$\sigma$-free" tests for hypotheses
of the form $V = V_0$ (tests for sphericity being a special case) and explicitly
compute the loss (in local powers) due to the nonspecification of scale.
The Gaussian version of this test is investigated further and its link with 
some classical tests of sphericity is discussed. In Section 4 we propose non- 
parametric (signed-rank-based) versions of the optimal procedures defined 
in Section 3 and study their invariance and asymptotic properties. Asymp- 
totic relative efficiencies with respect to the parametric Gaussian tests are 
derived in the elliptical case. All of these results are obtained under specified
$\theta$ first; then, in Section 4.4 we show that under minimal regularity
assumptions on the actual underlying density (essentially, those ensuring
ULAN), $\theta$ can safely be replaced by any root-$n$ consistent estimator $\hat\theta^{(n)}$.
In Section 5, we study the validity (under extensions of the null hypothe- 
sis of sphericity) and consistency properties under nonlocal alternatives of 
our testing procedures. An adjusted version of the sign test is proposed,
extending the validity of $\phi_S^{(n)}$ to the null hypothesis of unit shape. Necessary

and/or sufficient conditions for consistency are established. For Wilcoxon 
(i.e., linear) scores, these necessary and sufficient conditions take a very 
simple form which shows that the corresponding rank tests are consistent 
against essentially all nonspherical alternatives, including the nonelliptical 
ones. As for the (adjusted) sign test, it is shown to be consistent

against all non-unit-shape alternatives, confirming its qualification as a fully 
consistent test for unit shape. Section 6 provides some simulation results 
which indicate that finite-sample performances reflect the asymptotic pow- 
ers derived in the previous sections, as well as the nonelliptical consistency 
property established in Section 5. The Appendix compiles some technical 
proofs. 

1.4. Notation. The following notation will be used throughout. Denoting
by $e_\ell$ the $\ell$th vector in the canonical basis of $\mathbb{R}^k$ and by $I_k$ the $k \times k$ unit
matrix, let

$K_k := \sum_{i,j=1}^k (e_i e_j') \otimes (e_j e_i')$  and  $J_k := \sum_{i,j=1}^k (e_i e_j') \otimes (e_i e_j') = (\mathrm{vec}\, I_k)(\mathrm{vec}\, I_k)'$;

the $k^2 \times k^2$ matrix $K_k$ is known as the commutation matrix. With this notation,
$K_k \mathrm{vec}(A) = \mathrm{vec}(A')$ and $J_k \mathrm{vec}(A) = (\mathrm{tr}\, A)(\mathrm{vec}\, I_k)$. Note that $(1/k)J_k$
and $J_k^{\perp} := I_{k^2} - (1/k)J_k$ are the matrices of the projections onto the mutually
orthogonal subspaces $\{\lambda(\mathrm{vec}\, I_k) \mid \lambda \in \mathbb{R}\}$ and $\{\mathrm{vec}(A) \mid \mathrm{tr}\, A = 0\}$, respectively.
Define $M_k$ as the $(k(k+1)/2 - 1) \times k^2$ matrix such that $M_k'(\mathrm{vech}^{\circ} v) =
\mathrm{vec}(v)$ for any symmetric $k \times k$ matrix $v = (v_{ij})$ with $v_{11} = 0$, and let $N_k$ be
the $(k(k+1)/2 - 1) \times k^2$ real matrix such that $N_k(\mathrm{vec}\, v) = \mathrm{vech}^{\circ} v$ for any
symmetric $k \times k$ matrix $v$. Finally, we write $V^{\otimes 2}$ for the Kronecker product
$V \otimes V$.
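
The two identities for the commutation matrix and for $J_k$ are easy to check numerically. The following dependency-free Python sketch builds both matrices from their definitions as sums of Kronecker products of the elementary matrices $e_i e_j'$; all helper names (`unit_matrix`, `kron`, `vec`, and so on) are assumptions made for this example, and `vec` stacks columns.

```python
def unit_matrix(k, i, j):
    """k x k matrix e_i e_j' with a single 1 in position (i, j)."""
    return [[1 if (r == i and c == j) else 0 for c in range(k)] for r in range(k)]


def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    n, m = len(a), len(b)
    return [[a[p][q] * b[r][s] for q in range(n) for s in range(m)]
            for p in range(n) for r in range(m)]


def add(a, b):
    """Entrywise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def vec(m):
    """Stack the columns of m into one vector."""
    return [m[r][c] for c in range(len(m[0])) for r in range(len(m))]


def matvec(a, x):
    """Matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def commutation_matrix(k):
    """K_k = sum_{i,j} (e_i e_j') kron (e_j e_i')."""
    total = [[0] * (k * k) for _ in range(k * k)]
    for i in range(k):
        for j in range(k):
            total = add(total, kron(unit_matrix(k, i, j), unit_matrix(k, j, i)))
    return total


def j_matrix(k):
    """J_k = sum_{i,j} (e_i e_j') kron (e_i e_j') = (vec I_k)(vec I_k)'."""
    total = [[0] * (k * k) for _ in range(k * k)]
    for i in range(k):
        for j in range(k):
            total = add(total, kron(unit_matrix(k, i, j), unit_matrix(k, i, j)))
    return total
```

For a $2 \times 2$ matrix `a`, `matvec(commutation_matrix(2), vec(a))` reproduces `vec` of the transpose, and `matvec(j_matrix(2), vec(a))` equals the trace of `a` times `vec` of the identity.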

2. Uniform local asymptotic normality (ULAN). Our objective is to perform
inference on the shape parameter $V$ under unspecified location $\theta$, unspecified
scale $\sigma$ and unspecified standardized radial density $f_1$: $V$ is thus
the parameter of interest, whereas $\theta$, $\sigma^2$ and $f_1$ play the role of nuisance
parameters. The relevant statistical experiment involves the nonparametric
family

(2.1)  $\mathcal{P}^{(n)} := \bigcup_{f_1 \in \mathcal{F}_A} \mathcal{P}^{(n)}_{f_1} = \bigcup_{f_1 \in \mathcal{F}_A} \bigcup_{\sigma > 0} \mathcal{P}^{(n)}_{\sigma^2;f_1} := \bigcup_{f_1 \in \mathcal{F}_A} \bigcup_{\sigma > 0} \{P^{(n)}_{\theta,\sigma^2,V;f_1} \mid \theta \in \mathbb{R}^k, V \in \mathcal{V}_k\}$

[$f_1$ ranges over the set $\mathcal{F}_A$ of standardized densities satisfying Assumptions
(A1) and (A2) below], in which the partition of $\mathcal{P}^{(n)}$ into a collection of parametric
subexperiments $\mathcal{P}^{(n)}_{\sigma^2;f_1}$, all indexed by the same parameters $\theta$ and $V$,
induces a semiparametric structure. The main technical tool is the uniform
local asymptotic normality (ULAN), with respect to $\vartheta = (\theta', \sigma^2, (\mathrm{vech}^{\circ} V)')'$,
of the families $\mathcal{P}^{(n)}_{f_1}$. This LAN (ULAN) issue has been briefly touched by
Bickel (Example 4 in [7]). The very particular case of bivariate distributions
with finite second-order moments has been treated recently by Falk [11] in
his investigation of the inefficiency of empirical correlation coefficients.

In order to describe the extremely mild assumptions under which the
families $\mathcal{P}^{(n)}_{f_1}$ are ULAN, we introduce the following definitions. Consider
the measure space $(\Omega, \mathcal{B}_\Omega, \lambda)$, where $\lambda$ is some measure on the open subset
$\Omega \subset \mathbb{R}$ equipped with its Borel $\sigma$-field $\mathcal{B}_\Omega$. Denote by $L^2(\Omega, \lambda)$ the space
of measurable functions $h : \Omega \to \mathbb{R}$ satisfying $\int_\Omega [h(x)]^2\,d\lambda(x) < \infty$. In particular,
consider the space $L^2(\mathbb{R}_0^+, \mu_\ell)$ [resp. $L^2(\mathbb{R}, \nu_\ell)$] of square integrable
functions w.r.t. the Lebesgue measure with weight $r^\ell$ on $\mathbb{R}_0^+$ (resp. with
weight $e^{\ell x}$ on $\mathbb{R}$), that is, the space of measurable functions $h : \mathbb{R}_0^+ \to \mathbb{R}$ satisfying
$\int_0^\infty [h(x)]^2 x^\ell\,dx < \infty$ (resp. $h : \mathbb{R} \to \mathbb{R}$ satisfying $\int_{-\infty}^\infty [h(x)]^2 e^{\ell x}\,dx < \infty$).
Recall that $g \in L^2(\Omega, \lambda)$ admits a weak derivative $T$ iff $\int_\Omega g(x)\varphi'(x)\,dx =
-\int_\Omega T(x)\varphi(x)\,dx$ for all infinitely differentiable (in the classical sense) compactly
supported functions $\varphi$ on $\Omega$. The mapping $T$ is also called the derivative
of $g$ in the sense of distributions in $L^2(\Omega, \lambda)$. If, moreover, $T$ itself is
in $L^2(\Omega, \lambda)$, then $g$ belongs to $W^{1,2}(\Omega, \lambda)$, the Sobolev space of order 1 on
$L^2(\Omega, \lambda)$. For the sake of simplicity, we will write $L^2(\Omega)$ and $W^{1,2}(\Omega)$ when $\lambda$
is the Lebesgue measure on $\Omega$. The family $\mathcal{P}^{(n)}_{f_1}$ is ULAN under the following
assumptions on the radial density $f_1$:

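Assumption (A1). The mapping $x \mapsto f_1^{1/2}(x)$ is in $W^{1,2}(\mathbb{R}_0^+, \mu_{k-1})$.

Letting $\varphi_{f_1}(r) := -2 (f_1^{1/2})'(r)/f_1^{1/2}(r)$, where $(f_1^{1/2})'$ stands for the weak
derivative of $f_1^{1/2}$ in $L^2(\mathbb{R}_0^+, \mu_{k-1})$, Assumption (A1) ensures the finiteness
of radial Fisher information for location (expectation is taken under $P^{(n)}_{\vartheta;f_1}$),

$\mathcal{I}_k(f_1) := E[\varphi_{f_1}^2(d_i^{(n)}(\theta, V)/\sigma)] = \int_0^1 \varphi_{f_1}^2(\tilde F_{1k}^{-1}(u))\,du.$

Assumption (A2). The mapping $x \mapsto f_{1;\exp}^{1/2}(x) := f_1^{1/2}(e^x)$ is in $W^{1,2}(\mathbb{R}, \nu_k)$.

Letting $\psi_{f_1}(r) := -2 r^{-1} (f_{1;\exp}^{1/2})'(\ln r)/f_{1;\exp}^{1/2}(\ln r)$, where $(f_{1;\exp}^{1/2})'$ stands
for the weak derivative of $f_{1;\exp}^{1/2}$ in $L^2(\mathbb{R}, \nu_k)$, Assumption (A2) ensures the
finiteness of radial Fisher information for shape/scale,

$\mathcal{J}_k(f_1) := E[\psi_{f_1}^2(d_i^{(n)}(\theta, V)/\sigma)(d_i^{(n)}(\theta, V)/\sigma)^2] = \int_0^1 K_{f_1}^2(u)\,du,$

where $K_{f_1}(u) := \psi_{f_1}(\tilde F_{1k}^{-1}(u)) \tilde F_{1k}^{-1}(u)$.

In principle, the functions $\varphi_{f_1}$ and $\psi_{f_1}$ differ. However, they do coincide
(a.e.) under the following Assumption (A1-2), which, though slightly more
stringent than Assumptions (A1) and (A2), holds for most densities considered
in practice:

Assumption (A1-2). The radial density $f_1$ is absolutely continuous
with a.e.-derivative $f_1'$ and, letting $\varphi_{f_1} = \psi_{f_1} := -f_1'/f_1$, the integrals
$\mathcal{I}_k(f_1)$ and $\mathcal{J}_k(f_1)$ are finite.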



It should be stressed that none of these assumptions requires the existence
of any moment for the radial density $\tilde f_{1k}$. They are satisfied, for instance, for
all multivariate Student radial densities, including the Cauchy ones. For the
radial density $f^t_{1,\nu}$ of the $k$-variate $t$-distribution with $\nu$ degrees of freedom
[$\nu \in (0, \infty)$], it can be checked that

(2.2)  $\mathcal{I}_k(f^t_{1,\nu}) = a_{k,\nu} \frac{k(k + \nu)}{k + \nu + 2}$  and  $\mathcal{J}_k(f^t_{1,\nu}) = \frac{k(k + 2)(k + \nu)}{k + \nu + 2}.$

The same remark holds for the power-exponential distributions, provided
that $k \ge 2$ (which is not a limitation, since the problem under consideration
is void for $k = 1$), with

(2.3)  $\mathcal{I}_k(f^e_{1,\eta}) = 4\eta^2 b_{k,\eta}^{1/\eta} \frac{\Gamma((4\eta + k - 2)/2\eta)}{\Gamma(k/2\eta)}$  and  $\mathcal{J}_k(f^e_{1,\eta}) = k(k + 2\eta)$

($\Gamma$ denotes the Euler Gamma function). The corresponding values for $k$-variate
multinormal distributions can be obtained by taking limits of the
information quantities in (2.2) as $\nu \to \infty$ or, equivalently, by evaluating (2.3)
at $\eta = 1$:

$\mathcal{I}_k(\phi_1) = a_k k$  and  $\mathcal{J}_k(\phi_1) = k(k + 2).$

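Note that $\lim_{\nu \to 0} \mathcal{J}_k(f^t_{1,\nu}) = \lim_{\eta \to 0} \mathcal{J}_k(f^e_{1,\eta}) = k^2$, which is a sharp lower
bound for radial shape/scale information since, by Jensen's inequality and
integration by parts,

(2.4)  $(\mathcal{J}_k(f_1))^{1/2} \ge \int_0^1 K_{f_1}(u)\,du = \int_0^1 \varphi_{f_1}(\tilde F_{1k}^{-1}(u)) \tilde F_{1k}^{-1}(u)\,du = k.$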
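
The closed forms for the Student and power-exponential families can be evaluated directly to see the limits and the lower bound $k^2$ at work. The Python sketch below assumes that the shape/scale information is $k(k+2)(k+\nu)/(k+\nu+2)$ for Student densities and $k(k+2\eta)$ for power-exponential ones, and that the power-exponential location information is $4\eta^2 b^{1/\eta}\,\Gamma((4\eta+k-2)/2\eta)/\Gamma(k/2\eta)$; the function names are hypothetical, and the standardizing constant $b$ must be supplied by the caller since it is only defined implicitly through the median constraint (1.3).

```python
import math


def student_shape_information(k, nu):
    """Radial shape/scale information for k-variate Student, nu degrees of freedom."""
    return k * (k + 2) * (k + nu) / (k + nu + 2)


def power_exp_shape_information(k, eta):
    """Radial shape/scale information for k-variate power-exponential, exponent eta."""
    return k * (k + 2 * eta)


def power_exp_location_information(k, eta, b):
    """Radial location information for power-exponential densities.

    b is the standardizing constant b_{k,eta}; it is an input here because
    it is defined only implicitly by the median constraint.
    """
    # Gamma ratio computed on the log scale for numerical stability.
    log_ratio = math.lgamma((4 * eta + k - 2) / (2 * eta)) - math.lgamma(k / (2 * eta))
    return 4 * eta ** 2 * b ** (1 / eta) * math.exp(log_ratio)
```

Letting `nu` grow large in `student_shape_information`, or setting `eta = 1` in `power_exp_shape_information`, gives the Gaussian value $k(k+2)$, whereas very small `nu` or `eta` approaches the bound $k^2$.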



Similarly, assuming that the density in (1.1) has finite second-order moments,
the radial information for location $\mathcal{I}_k(f_1)$ satisfies (the Cauchy-Schwarz
inequality)

$\mathcal{I}_k(f_1) \ge k^2 \Big/ \int_0^1 (\tilde F_{1k}^{-1}(u))^2\,du$

with equality in the multinormal case only.

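Proposition 2.1. Under Assumptions (A1) and (A2), the family $\mathcal{P}^{(n)}_{f_1} :=
\{P^{(n)}_{\vartheta;f_1} \mid \vartheta \in \Theta\}$ is ULAN, with [writing $d_i$ and $U_i$, resp., for $d_i^{(n)}(\theta, V)$ and
$U_i^{(n)}(\theta, V)$] central sequence $\Delta^{(n)}_{f_1}(\vartheta) := (\Delta^{(n)\prime}_{f_1;1}(\vartheta), \Delta^{(n)}_{f_1;2}(\vartheta), \Delta^{(n)\prime}_{f_1;3}(\vartheta))'$,
where

(2.5)  $\Delta^{(n)}_{f_1;1}(\vartheta) := n^{-1/2} \frac{1}{\sigma} \sum_{i=1}^n \varphi_{f_1}\bigl(\frac{d_i}{\sigma}\bigr) V^{-1/2} U_i,$

       $\begin{pmatrix} \Delta^{(n)}_{f_1;2}(\vartheta) \\ \Delta^{(n)}_{f_1;3}(\vartheta) \end{pmatrix} := \frac{1}{2} n^{-1/2} \begin{pmatrix} \sigma^{-2} (\mathrm{vec}\, I_k)' \\ M_k (V^{\otimes 2})^{-1/2} \end{pmatrix} \sum_{i=1}^n \mathrm{vec}\Bigl(\psi_{f_1}\bigl(\frac{d_i}{\sigma}\bigr) \frac{d_i}{\sigma} U_i U_i' - I_k\Bigr),$

and full-rank information matrix

(2.6)  $\Gamma_{f_1}(\vartheta) := \begin{pmatrix} \Gamma_{f_1;11}(\vartheta) & 0 & 0 \\ 0 & \Gamma_{f_1;22}(\vartheta) & \Gamma'_{f_1;32}(\vartheta) \\ 0 & \Gamma_{f_1;32}(\vartheta) & \Gamma_{f_1;33}(\vartheta) \end{pmatrix},$

where

$\Gamma_{f_1;11}(\vartheta) := \frac{1}{k\sigma^2} \mathcal{I}_k(f_1) V^{-1},$

$\Gamma_{f_1;22}(\vartheta) := \frac{1}{4\sigma^4} (\mathcal{J}_k(f_1) - k^2),$

$\Gamma_{f_1;32}(\vartheta) := \frac{1}{4k\sigma^2} (\mathcal{J}_k(f_1) - k^2) M_k (V^{\otimes 2})^{-1/2} (\mathrm{vec}\, I_k)$

and

(2.7)  $\Gamma_{f_1;33}(\vartheta) := \frac{1}{4} M_k (V^{\otimes 2})^{-1/2} \Bigl[\frac{\mathcal{J}_k(f_1)}{k(k+2)} (I_{k^2} + K_k + J_k) - J_k\Bigr] (V^{\otimes 2})^{-1/2} M_k'.$

More precisely, for any $\vartheta^{(n)} = (\theta^{(n)\prime}, \sigma^{2(n)}, (\mathrm{vech}^{\circ} V^{(n)})')' = \vartheta + O(n^{-1/2})$
and any bounded sequence $\tau^{(n)} := (t^{(n)\prime}, s^{(n)}, (\mathrm{vech}^{\circ} v^{(n)})')' = (\tau_1^{(n)\prime}, \tau_2^{(n)}, \tau_3^{(n)\prime})'
\in \mathbb{R}^{k + k(k+1)/2}$, we have

$\Lambda^{(n)}_{\vartheta^{(n)} + n^{-1/2}\tau^{(n)}/\vartheta^{(n)};f_1} := \log\bigl(dP^{(n)}_{\vartheta^{(n)} + n^{-1/2}\tau^{(n)};f_1}/dP^{(n)}_{\vartheta^{(n)};f_1}\bigr)$
$= (\tau^{(n)})' \Delta^{(n)}_{f_1}(\vartheta^{(n)}) - \frac{1}{2} (\tau^{(n)})' \Gamma_{f_1}(\vartheta) \tau^{(n)} + o_P(1)$

and

$\Delta^{(n)}_{f_1}(\vartheta^{(n)}) \xrightarrow{\mathcal{L}} \mathcal{N}(0, \Gamma_{f_1}(\vartheta))$

under $P^{(n)}_{\vartheta^{(n)};f_1}$, as $n \to \infty$.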

Proof. See Appendix (Section A.1). □

Note that the structure of the information matrix for shape (2.7) is not
unfamiliar, having been previously obtained under much more restrictive
assumptions; see, for example, page 219 of [8].



3. Parametrically efficient tests for shape.

3.1. An efficient central sequence for shape. The block-diagonal structure
of the information matrix (2.6) and ULAN imply that substituting
a (in principle, discretized; see, e.g., [30], page 125) root-$n$ consistent estimator
$\hat\theta = \hat\theta^{(n)}$ for the unknown $\theta$ has no influence, asymptotically, on
the $(\sigma^2, V)$-part of the central sequence $\Delta^{(n)}_{f_1}(\vartheta)$. Optimal inference about
$(\sigma^2, V)$ can thus be based, without any loss of (asymptotic) efficiency, on
$(\Delta^{(n)}_{f_1;2}(\hat\theta, \sigma^2, V), \Delta^{(n)\prime}_{f_1;3}(\hat\theta, \sigma^2, V))'$ as if $\hat\theta$ were the actual location parameter;
see Section 4.4 for details. Therefore, in this section, we assume throughout
that $\theta$ is known. Similarly, replacing $\sigma^2$ and $V$ with root-$n$ consistent estimators
$\hat\sigma^{2(n)}$ and $\hat V^{(n)}$ in the $\theta$-part of the central sequence $\Delta^{(n)}_{f_1}(\vartheta)$ has no
impact, asymptotically, on inference about $\theta$.

Unlike the asymptotic covariances between the location and scatter components
of the central sequence $\Delta^{(n)}_{f_1}(\vartheta)$, the asymptotic covariances between
the $\sigma^2$-part $\Delta^{(n)}_{f_1;2}(\vartheta)$ and the $V$-part $\Delta^{(n)}_{f_1;3}(\vartheta)$ are not zero. This means that
a local perturbation of scale has the same asymptotic impact on $\Delta^{(n)}_{f_1;3}(\vartheta)$ as
a local perturbation of $V$. It follows that the cost of not knowing the actual
value of $\sigma^2$ is strictly positive when performing inference on $V$. Since it is
hard to think of any practical problem where the scale (but not the shape)
is specified, we concentrate on optimality under unspecified scale $\sigma^2$ and
explicitly compute the information loss due to the presence of this nuisance.

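LAN and the convergence of local experiments to the Gaussian shift experiment

\[
\begin{pmatrix} \Delta_2 \\ \Delta_3 \end{pmatrix} \sim \mathcal{N}\left(
\begin{pmatrix} \Gamma_{f_1;22}(\vartheta) & \Gamma_{f_1;32}'(\vartheta) \\ \Gamma_{f_1;32}(\vartheta) & \Gamma_{f_1;33}(\vartheta) \end{pmatrix}
\begin{pmatrix} \tau_2 \\ \tau_3 \end{pmatrix},
\begin{pmatrix} \Gamma_{f_1;22}(\vartheta) & \Gamma_{f_1;32}'(\vartheta) \\ \Gamma_{f_1;32}(\vartheta) & \Gamma_{f_1;33}(\vartheta) \end{pmatrix}
\right), \qquad (\tau_2, \tau_3')' \in \mathbb{R}^{k(k+1)/2}, \tag{3.1}
\]

imply that locally optimal inference on shape, in the presence of an unspecified scale parameter, should be based on the residual of the regression [in (3.1)] of $\Delta_3$ with respect to $\Delta_2$, computed at $\Delta^{(n)}_{f_1;3}(\vartheta)$ (the shape part of the central sequence) and $\Delta^{(n)}_{f_1;2}(\vartheta)$ (the scale part of the same). This residual takes the form $\Delta_3 - \Gamma_{f_1;32}(\vartheta)\Gamma^{-1}_{f_1;22}(\vartheta)\Delta_2$; the resulting $f_1$-efficient central sequence for shape is thus

\[
\Delta^{*(n)}_{f_1}(\vartheta) = \Delta^{(n)}_{f_1;3}(\vartheta) - \Gamma_{f_1;32}(\vartheta)\,\Gamma^{-1}_{f_1;22}(\vartheta)\,\Delta^{(n)}_{f_1;2}(\vartheta),
\]

which, after some elementary algebra, reduces to

\[
\Delta^{*(n)}_{f_1}(\vartheta) = \frac{1}{2}\, n^{-1/2} M_k (V^{\otimes 2})^{-1/2} \Big[ I_{k^2} - \frac{1}{k} J_k \Big]
\sum_{i=1}^n \varphi_{f_1}\Big(\frac{d_i}{\sigma}\Big) \frac{d_i}{\sigma}\, \operatorname{vec}(U_i U_i').
\]

This efficient central sequence, under $P^{(n)}_{\vartheta;f_1}$, is asymptotically normal, with mean zero and covariance (the efficient Fisher information for shape under radial density $f_1$) given by

\[
\Gamma^*_{f_1}(\vartheta) = \Gamma_{f_1;33}(\vartheta) - \Gamma_{f_1;32}(\vartheta)\,\Gamma^{-1}_{f_1;22}(\vartheta)\,\Gamma_{f_1;32}'(\vartheta).
\]

After some routine computation, this efficient information takes the form

\[
\begin{aligned}
\Gamma^*_{f_1}(\vartheta)
&= \frac{1}{4}\, M_k (V^{\otimes 2})^{-1/2} \Big[ \frac{\mathcal{J}_k(f_1)}{k(k+2)}\,(I_{k^2} + K_k + J_k) - \frac{\mathcal{J}_k(f_1)}{k^2}\, J_k \Big] (V^{\otimes 2})^{-1/2} M_k' \\
&= \frac{\mathcal{J}_k(f_1)}{4k(k+2)}\, M_k (V^{\otimes 2})^{-1/2} \Big[ I_{k^2} + K_k - \frac{2}{k}\, J_k \Big] (V^{\otimes 2})^{-1/2} M_k'
=: \mathcal{J}_k(f_1)\, \Upsilon_k^{-1}(V),
\end{aligned} \tag{3.2}
\]

a form that is not unfamiliar in the area of robust estimation of covariance matrices; see, for instance, the asymptotic covariances in [40, 42, 50, 51] for scatter estimates [as in (2.6), (2.7)] and in [41, 52] for shape estimates [as in (3.2)].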
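The "routine computation" leading to (3.2) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $V = I_k$ and $\sigma = 1$, drops the outer factors $M_k \cdots M_k'$ (which multiply both sides identically), and assumes that the scale-shape block has the form $\Gamma_{f_1;32} = (4k\sigma^2)^{-1}(\mathcal{J}_k(f_1) - k^2) M_k \operatorname{vec}(V^{-1})$, which is not displayed in this section. The function names are hypothetical.

```python
def commutation_matrix(k):
    """Return K_k, the k^2 x k^2 matrix such that K_k vec(A) = vec(A')."""
    size = k * k
    K = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            # column-major vec: entry (i, j) of A sits at position j*k + i
            K[j * k + i][i * k + j] = 1.0
    return K


def vec_identity(k):
    """Return vec(I_k) as a list of length k^2."""
    return [1.0 if a % (k + 1) == 0 else 0.0 for a in range(k * k)]


def efficient_information_gap(k, info):
    """Largest absolute entry of (Schur complement) - (closed form in (3.2)).

    `info` plays the role of J_k(f_1) and must exceed k^2.
    """
    size = k * k
    K = commutation_matrix(k)
    v = vec_identity(k)
    g22 = (info - k * k) / 4.0                          # scale information
    g32 = [(info - k * k) / (4.0 * k) * x for x in v]   # assumed scale-shape block
    c = info / (k * (k + 2.0))
    gap = 0.0
    for a in range(size):
        for b in range(size):
            eye = 1.0 if a == b else 0.0
            jab = v[a] * v[b]                           # entry of J_k = vec(I) vec(I)'
            g33 = 0.25 * (c * (eye + K[a][b] + jab) - jab)
            schur = g33 - g32[a] * g32[b] / g22
            closed = 0.25 * c * (eye + K[a][b] - 2.0 / k * jab)
            gap = max(gap, abs(schur - closed))
    return gap


gap_k3 = efficient_information_gap(3, 20.0)
gap_k2 = efficient_information_gap(2, 10.0)
```

Both gaps should vanish up to rounding error, which is just the statement that projecting out the scale direction only changes the coefficient of $J_k$.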

In the sequel, optimality (in the local and asymptotic sense, at radial density $f_1$) is to be understood in the context of the Gaussian shift experiments associated with the efficient central sequences $\Delta^{*(n)}_{f_1}(\vartheta)$. In particular, a sequence of tests will be called locally and asymptotically maximin-efficient (at asymptotic level $\alpha$) if it is asymptotically maximin in the sequence of experiments associated with $\Delta^{*(n)}_{f_1}(\vartheta)$.

3.2. Optimal parametric tests for shape. Consider the problem of testing a null hypothesis of the form $\mathcal{H}_0 : V = V_0$ in the parametric model where $f_1$ is known and the scale $\sigma^2$ is unspecified. Optimality (in a local and asymptotic sense; see Proposition 3.1 for a precise statement) is reached by tests based on quadratic forms in the $f_1$-efficient central sequence for shape. More precisely, the optimal test statistics take the form

\[
Q^{(n)}_{f_1} := (\Delta^{*(n)}_{f_1}(\hat\vartheta_0))'\, (\Gamma^*_{f_1}(\hat\vartheta_0))^{-1}\, \Delta^{*(n)}_{f_1}(\hat\vartheta_0),
\]

where, denoting by $\hat\sigma$ a root-$n$ consistent estimator for $\sigma$, we let $\hat\vartheta_0 := (\theta', \hat\sigma^2, (\operatorname{vech} V_0)')'$. Note that consistent estimation of $\sigma$ under the family $\bigcup_{f_1}\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0;f_1}\}$ is easily achieved since $\sigma$ is then the common median of the distances $d_i(\theta, V_0)$. As we shall see in Section 3.3, the Gaussian version of these optimal parametric tests allows the bypassing of this estimation of $\sigma$.

Lemma 3.1 below leads to the more explicit form

\[
Q^{(n)}_{f_1} = \frac{k(k+2)}{2n\,\mathcal{J}_k(f_1)} \sum_{i,j=1}^n
\varphi_{f_1}\Big(\frac{d_i}{\hat\sigma}\Big) \varphi_{f_1}\Big(\frac{d_j}{\hat\sigma}\Big) \frac{d_i d_j}{\hat\sigma^2}
\Big( (U_i' U_j)^2 - \frac{1}{k} \Big),
\]

with $d_i := d_i^{(n)}(\theta, V_0)$ and $U_i := U_i^{(n)}(\theta, V_0)$.



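Lemma 3.1. Denote by $e_{k^2,1}$ the first vector of the canonical basis of $\mathbb{R}^{k^2}$. Then if $V = (V_{ij})$ is symmetric with $V_{11} = 1$, we have

\[
\begin{aligned}
\frac{1}{k(k+2)}\, M_k' \Upsilon_k(V) M_k
= {} & [I_{k^2} + K_k](V^{\otimes 2}) - 2 (V^{\otimes 2}) e_{k^2,1} (\operatorname{vec} V)' \\
& - 2 (\operatorname{vec} V)(e_{k^2,1})'(V^{\otimes 2}) + 2 (\operatorname{vec} V)(\operatorname{vec} V)'.
\end{aligned} \tag{3.3}
\]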



Proof. See Appendix (Section A. 2). □ 

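Proposition 3.1. Let $f_1$ satisfy Assumptions (A1) and (A2). Then, denoting by $\|A\| := [\operatorname{tr}(AA')]^{1/2}$ the Frobenius norm of $A$,

(i) $Q^{(n)}_{f_1}$ is asymptotically chi-square with $k(k+1)/2 - 1$ degrees of freedom under $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0;f_1}\}$, and asymptotically noncentral chi-square, still with $k(k+1)/2 - 1$ degrees of freedom but with noncentrality parameter

\[
\frac{\mathcal{J}_k(f_1)}{2k(k+2)} \Big[ \operatorname{tr}((V_0^{-1} v)^2) - \frac{1}{k} (\operatorname{tr} V_0^{-1} v)^2 \Big]
= \frac{\mathcal{J}_k(f_1)}{2k(k+2)} \Big\| V_0^{-1/2} v V_0^{-1/2} - \frac{1}{k} (\operatorname{tr} V_0^{-1} v)\, I_k \Big\|^2
\]

under $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0+n^{-1/2}v;f_1}\}$;

(ii) the sequence of tests $\phi^{(n)}_{f_1}$ which consists of rejecting $\mathcal{H}_0 : V = V_0$ as soon as $Q^{(n)}_{f_1}$ exceeds the $\alpha$ upper-quantile of a chi-square variable with $k(k+1)/2 - 1$ degrees of freedom, has asymptotic level $\alpha$ under $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0;f_1}\}$ and is locally and asymptotically maximin-efficient, still at asymptotic level $\alpha$, for $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0;f_1}\}$ against alternatives of the form $\bigcup_{\sigma^2}\bigcup_{V \neq V_0}\{P^{(n)}_{\theta,\sigma^2,V;f_1}\}$.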

Proof. See Appendix (Section A. 2). □ 

In contrast with this unspecified-$\sigma^2$ test, the locally and asymptotically optimal procedure for testing $\mathcal{H}_0 : V = V_0$ under specified radial density $f_1$, specified $\theta$ and specified scale $\sigma^2$ rejects (at asymptotic level $\alpha$) whenever

\[
Q^{(n)}_{\sigma^2;f_1} := (\Delta^{(n)}_{f_1;3}(\theta, \sigma^2, V_0))'\, (\Gamma_{f_1;33}(\theta, \sigma^2, V_0))^{-1}\, \Delta^{(n)}_{f_1;3}(\theta, \sigma^2, V_0) \tag{3.4}
\]

exceeds the $\alpha$ upper-quantile of a chi-square with $k(k+1)/2 - 1$ degrees of freedom. The efficiency loss due to an unspecified $\sigma^2$ can thus be measured by the difference between the noncentrality parameters in the asymptotic chi-square distributions of $Q^{(n)}_{\sigma^2;f_1}$ and $Q^{(n)}_{f_1}$ under local alternatives. Along the same lines as the proof of Proposition 3.1, one can show that this difference, under $P^{(n)}_{\theta,\sigma^2,V_0+n^{-1/2}v;f_1}$, is

\[
\frac{1}{4k^2}\, (\mathcal{J}_k(f_1) - k^2)\, (\operatorname{tr} V_0^{-1} v)^2. \tag{3.5}
\]

Inequality (2.4) confirms the unsurprising fact that this loss is nonnegative and an increasing function of the information for shape (or scale) $\mathcal{J}_k(f_1)$. Quite remarkably, it does not depend on the scale $\sigma^2$ itself. Also, note that the loss is nil against local alternatives such that $\operatorname{tr} V_0^{-1} v = 0$. When testing for sphericity ($V_0 = I_k$), this reduces to $\operatorname{tr} v = 0$; in particular, there is no loss in the case $v_{ii} = 0$ for all $i = 2, \ldots, k$.

Further investigation of (3.5) reveals some interesting facts concerning the relation between this loss and the tails of underlying radial densities. Assume, for the sake of simplicity, that $V_0 = I_k$ and consider the "elementary diagonal deviations from sphericity" associated with $v = \lambda e_i e_i'$ for some $i = 2, \ldots, k$. The relative loss in local powers (strictly speaking, the relative loss in the corresponding noncentrality parameters) can be evaluated as the ratio of (3.5) and the noncentrality parameter one would obtain for the specified-$\sigma$ test statistic (3.4), namely the sum of (3.5) and the noncentrality parameter in Proposition 3.1(i). This relative loss no longer depends on $\lambda$ and takes the form

\[
\frac{(k+2)(\mathcal{J}_k(f_1) - k^2)}{3k(\mathcal{J}_k(f_1) - k^2) + 2k^2(k-1)}, \tag{3.6}
\]

an increasing function of $\mathcal{J}_k(f_1)$ with lower and upper bounds $0$ and $(k+2)/3k$, corresponding to arbitrarily heavy- and arbitrarily light-tailed distributions, respectively. Indeed, these bounds can be obtained, for example, by letting $\eta \to 0$ and $\eta \to \infty$, respectively, in the power-exponential family of distributions considered in Section 1.2. We refer to [18] for more general results on efficiency losses in the related problem of estimating the shape parameter.

Table 1

Numerical values of the relative power losses in (3.6) under k-variate power-exponential densities (with η = 0.1, 0.5, 1, 2, 5, along with the limiting values obtained for η → 0 and η → ∞), and under k-variate Student densities (with ν degrees of freedom, ν = 1, 3, 5, 8, 15, along with the limiting values obtained for ν → 0 and ν → ∞), for k = 2, 3, 4, 6, 10 and for k → ∞

        Parameter η of the power-exponential density
k       → 0     0.1     0.5     1       2       5       → ∞
2       0.000   0.154   0.400   0.500   0.571   0.625   0.667
3       0.000   0.072   0.238   0.333   0.417   0.490   0.556
4       0.000   0.045   0.167   0.250   0.333   0.417   0.500
6       0.000   0.025   0.103   0.167   0.242   0.333   0.444
10      0.000   0.013   0.057   0.100   0.160   0.250   0.400
∞       0.000   0.000   0.000   0.000   0.000   0.000   0.333

        Degrees of freedom ν of the underlying t density
k       → 0     1       3       5       8       15      → ∞
2       0.000   0.250   0.375   0.417   0.444   0.469   0.500
3       0.000   0.111   0.200   0.238   0.267   0.294   0.333
4       0.000   0.063   0.125   0.156   0.182   0.208   0.250
6       0.000   0.028   0.063   0.083   0.103   0.125   0.167
10      0.000   0.010   0.025   0.036   0.047   0.063   0.100
∞       0.000   0.000   0.000   0.000   0.000   0.000   0.000

Some numerical values of those relative losses (3.6) are provided in Table 1
where we consider:

(a) the family of power-exponential densities (providing a full range of tail behaviors), with relative loss $(k+2)\eta / (k(k + 3\eta - 1))$ and

(b) the more familiar heavy-tailed Student densities with $\nu$ degrees of freedom (including the Gaussian as $\nu \to \infty$), with relative loss $\nu / (k(k + \nu - 1))$,

respectively, for several values of the space dimension $k$. Limits as $k \to \infty$ are taken for fixed $\nu$ or $\eta$; note that the $\eta = 1$ power-exponential and $\nu = \infty$ Student columns both correspond to the Gaussian case.
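The entries of Table 1 can be recomputed directly from (3.6) and the two closed forms in (a) and (b). The following is a minimal sketch with hypothetical function names. It also uses two values that are not restated in this section and should be read as assumptions: the Gaussian information $\mathcal{J}_k(\phi_1) = k(k+2)$, and the Student information $\mathcal{J}_k(f^t_{1,\nu}) = k(k+2)(k+\nu)/(k+\nu+2)$ implied by the constant in (4.10).

```python
def relative_loss(k, info):
    """Relative loss (3.6) as a function of info = J_k(f_1) >= k^2."""
    excess = info - k ** 2
    return (k + 2) * excess / (3 * k * excess + 2 * k ** 2 * (k - 1))


def loss_power_exponential(k, eta):
    """Closed form (a): power-exponential density with parameter eta."""
    return (k + 2) * eta / (k * (k + 3 * eta - 1))


def loss_student(k, nu):
    """Closed form (b): Student density with nu degrees of freedom."""
    return nu / (k * (k + nu - 1))


def student_information(k, nu):
    """Assumed value of J_k for the Student radial density (see lead-in)."""
    return k * (k + 2) * (k + nu) / (k + nu + 2)


# Rows of Table 1 for the finite parameter values.
dimensions = [2, 3, 4, 6, 10]
power_exponential_rows = {
    k: [round(loss_power_exponential(k, eta), 3) for eta in (0.1, 0.5, 1, 2, 5)]
    for k in dimensions
}
student_rows = {
    k: [round(loss_student(k, nu), 3) for nu in (1, 3, 5, 8, 15)]
    for k in dimensions
}
```

Plugging `student_information(k, nu)` into `relative_loss` should reproduce `loss_student(k, nu)`, and the Gaussian value $k(k+2)$ should give $1/k$ in both families.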

3.3. Optimal Gaussian tests for shape. The parametric tests $\phi^{(n)}_{f_1}$ described in part (ii) of Proposition 3.1 achieve local and asymptotic optimality at radial density $f_1$ but are generally not valid when the underlying radial density is $g_1 \neq f_1$. If correctly formulated, the Gaussian version of these tests (obtained for $f_1 = \phi_1$, where $\phi_1$ was defined in Section 1.2) is an interesting exception to this rule and can easily be written in a form that remains valid under the class of all radial densities $g_1$ such that $\tilde g_{1k}$ has finite fourth-order moments.

Denote by $D_k(g_1) := E[(\tilde G_{1k}^{-1}(U))^2]$ and $E_k(g_1) := E[(\tilde G_{1k}^{-1}(U))^4]$, where $U$ stands for a random variable with uniform distribution over $(0,1)$, the second- and fourth-order moments of $\tilde g_{1k}$, respectively, and assume that $E_k(g_1) < \infty$ [hence also that $D_k(g_1) < \infty$]. These two quantities are closely related to the kurtosis of the elliptical distribution under consideration. To be precise, the kurtosis $3\kappa_k(g_1)$ of an elliptically symmetric random $k$-vector $X = (X_i)$ with location center $\theta = (\theta_1, \ldots, \theta_k)'$, scale $\sigma^2$, shape matrix $V$, and radial density $g_1$ is defined to be

\[
3\kappa_k(g_1) := \frac{E[(X_i - \theta_i)^4]}{E^2[(X_i - \theta_i)^2]} - 3;
\]

see, for example, [1], page 54, [38] or [50]. This quantity depends only on the dimension $k$ and the radial density $g_1$, not on $i$ or on the other parameters characterizing the elliptical distribution (which of course justifies the notation); it is related to $D_k(g_1)$ and $E_k(g_1)$ by the simple relation

\[
\kappa_k(g_1) = \frac{k}{k+2}\, \frac{E_k(g_1)}{D_k^2(g_1)} - 1.
\]

At the $k$-variate Gaussian distribution and $t$-distribution with $\nu$ degrees of freedom ($\nu > 4$), this kurtosis parameter takes values $\kappa_k(\phi_1) = 0$ and $\kappa_k(f^t_{1,\nu}) = 2/(\nu - 4)$, respectively.

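The Gaussian version of the efficient central sequence for shape ($\Delta^{*(n)}_{\phi_1}$) can be written as $\Delta^{*(n)}_{\phi_1}(\vartheta) = a_k \sigma^{-2}\, T^{(n)}_{\theta,V}$, where

\[
T^{(n)}_{\theta,V} := \frac{1}{2}\, n^{-1/2} M_k (V^{\otimes 2})^{-1/2} \Big[ I_{k^2} - \frac{1}{k} J_k \Big] (V^{\otimes 2})^{-1/2}
\sum_{i=1}^n \operatorname{vec}((X_i - \theta)(X_i - \theta)').
\]

It is convenient to work with $T^{(n)}_{\theta,V}$ and an estimate $\hat\Gamma^{(n)}$ of its asymptotic covariance rather than with $\Delta^{*(n)}_{\phi_1}(\vartheta)$ and an estimate of the corresponding information matrix since the scalar factor $a_k \sigma^{-2}$ in the quadratic form in $\Delta^{*(n)}_{\phi_1}(\vartheta)$ cancels out. For optimality (at Gaussian radial densities), it is sufficient for $\hat\Gamma^{(n)}$ to consistently estimate the asymptotic covariance of $T^{(n)}_{\theta,V_0}$ under $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0;\phi_1}\}$. Letting

\[
\hat\Gamma^{(n)}_{V_0} := \Big( \frac{1}{n} \sum_{i=1}^n d_i^4 \Big)\, \Upsilon_k^{-1}(V_0),
\]

with the same $d_i = d_i^{(n)}(\theta, V_0)$'s as in Section 3.2, it is easy to check that $\hat\Gamma^{(n)}_{V_0}$ provides, for all $\theta$, a consistent estimate of the asymptotic variance of $T^{(n)}_{\theta,V_0}$, not only under $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0;\phi_1}\}$, but also under $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V_0;g_1}\}$, where the union is taken over the set of all densities $g_1$ such that $E_k(g_1) < \infty$. The Gaussian test statistic then takes the form $Q^{(n)}_{\mathcal{N}} := (T^{(n)}_{\theta,V_0})'\, (\hat\Gamma^{(n)}_{V_0})^{-1}\, T^{(n)}_{\theta,V_0}$. Lemma 3.1 and standard algebra yield

\[
Q^{(n)}_{\mathcal{N}} = \frac{k(k+2)}{2 \sum_{i=1}^n d_i^4} \sum_{i,j=1}^n d_i^2 d_j^2 \Big( (U_i' U_j)^2 - \frac{1}{k} \Big), \tag{3.7}
\]

with the same $U_i = U_i^{(n)}(\theta, V_0)$ as in Section 3.2. Now, defining

\[
S = S^{(n)} := \frac{1}{n} \sum_{i=1}^n d_i^2\, U_i U_i' = \frac{1}{n}\, V_0^{-1/2} \Big( \sum_{i=1}^n (X_i - \theta)(X_i - \theta)' \Big) V_0^{-1/2},
\]

and letting $\hat\kappa^{(n)} := [k (n^{-1} \sum_{i=1}^n d_i^4)] / [(k+2)(n^{-1} \sum_{i=1}^n d_i^2)^2] - 1$ be a consistent estimate of the kurtosis parameter $\kappa_k(g_1)$, (3.7) takes the form

\[
Q^{(n)}_{\mathcal{N}} = \frac{n^2 k(k+2)}{2 \sum_{i=1}^n d_i^4} \Big[ \operatorname{tr} S^2 - \frac{1}{k} \operatorname{tr}^2 S \Big]
= \frac{1}{1 + \hat\kappa^{(n)}}\, \frac{nk^2}{2}\, \Big\| \frac{S}{\operatorname{tr} S} - \frac{1}{k} I_k \Big\|^2. \tag{3.8}
\]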



It is straightforward to check that $Q^{(n)}_{\mathcal{N}}$ is invariant under rotations, scale transformations and reflections (with respect to $\theta$, in the metric associated with $V_0$), but that it is not (even asymptotically) invariant under the group of monotone continuous radial transformations (see Section 4.1 below). The following proposition summarizes the asymptotic properties of the Gaussian procedure based on $Q^{(n)}_{\mathcal{N}}$:

Proposition 3.2. Denote by $\phi^{(n)}_{\mathcal{N}}$ the parametric Gaussian test rejecting the null hypothesis $\mathcal{H}_0 : V = V_0$ whenever $Q^{(n)}_{\mathcal{N}}$ exceeds the $\alpha$ upper-quantile of a chi-square distribution with $k(k+1)/2 - 1$ degrees of freedom. Then (unions over $g_1$ are taken over all densities such that $\tilde g_{1k}$ has finite fourth-order moments):

(i) $Q^{(n)}_{\mathcal{N}}$ is asymptotically chi-square with $k(k+1)/2 - 1$ degrees of freedom under $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V_0;g_1}\}$, and asymptotically noncentral chi-square, still with $k(k+1)/2 - 1$ degrees of freedom, but with noncentrality parameter

\[
\frac{1}{2(1 + \kappa_k(g_1))} \Big[ \operatorname{tr}((V_0^{-1} v)^2) - \frac{1}{k} (\operatorname{tr} V_0^{-1} v)^2 \Big]
\]

under $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0+n^{-1/2}v;g_1}\}$;

(ii) the sequence of tests $\phi^{(n)}_{\mathcal{N}}$ has asymptotic level $\alpha$ under $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V_0;g_1}\}$ and is locally and asymptotically maximin-efficient, still at asymptotic level $\alpha$, for $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V_0;g_1}\}$ against alternatives of the form $\bigcup_{\sigma^2}\bigcup_{V \neq V_0}\{P^{(n)}_{\theta,\sigma^2,V;\phi_1}\}$.

Proof. See Appendix (Section A.3). □

For $V_0 = I_k$, the test statistic $Q^{(n)}_{\mathcal{N}}$ in (3.8) and Proposition 3.2 actually appears as a modification of the test statistic

\[
Q^{(n)}_{\mathrm{John}} := \frac{nk^2}{2}\, \Big\| \frac{S}{\operatorname{tr} S} - \frac{1}{k} I_k \Big\|^2
\]

proposed by John [24, 25]. The only difference is that $Q^{(n)}_{\mathrm{John}}$ relies on the Gaussian value $\kappa = 0$ of the kurtosis parameter, whereas $Q^{(n)}_{\mathcal{N}}$ instead involves an estimation $\hat\kappa^{(n)}$ of the same, which makes the asymptotic null distribution of $Q^{(n)}_{\mathcal{N}}$ agree, under any elliptical distribution with finite fourth-order moments, with the limiting distribution of $Q^{(n)}_{\mathrm{John}}$ in the multinormal case.

This adjustment is very much in the spirit of the Muirhead and Waternaux version [38] of Mauchly's Gaussian likelihood ratio test [36], probably the most widely used test of sphericity. Muirhead and Waternaux [38] actually show that $(-2 \log \Lambda^{(n)})/(1 + \kappa_k(g_1))$, where $-2 \log \Lambda^{(n)}$ is the Gaussian likelihood ratio test statistic, is asymptotically chi-square, with $k(k+1)/2 - 1$ degrees of freedom, under $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,I_k;g_1}\}$ (the union is taken over all $g_1$ such that $\tilde g_{1k}$ has finite fourth-order moments); the population kurtosis parameter $\kappa_k(g_1)$ can of course be replaced by its sample counterpart $\hat\kappa^{(n)}$ without modifying the asymptotic chi-square distribution. These results straightforwardly extend to the problem of testing for a specified shape $V_0$ rather than for sphericity. It also follows from [38] that the adjusted version of John's test statistic, namely our Gaussian test statistic $Q^{(n)}_{\mathcal{N}}$, is asymptotically equivalent to their adjusted version of the Mauchly test. In the sequel, the expression "optimal parametric Gaussian test" will refer to any of these tests. Note, however, that optimality here follows from Proposition 3.2 and is therefore of an asymptotic nature. Actually, only John's original (nonadjusted) test [24] enjoys some finite-sample optimality properties (restricted to the Gaussian case), being locally most powerful invariant at the multinormal distribution. Our adjusted tests inherit, under weaker asymptotic form, this optimality property from John's test; on the other hand, they remain valid under non-Gaussian densities, which is not the case for John's.
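To make the relation between John's statistic and its kurtosis-adjusted version concrete, here is a minimal sketch for the sphericity case. It assumes $V_0 = I_k$ and a known center $\theta$, so that no matrix square root is needed; the function names and the simulated data are hypothetical. The first function uses the trace form, the second the double sum over pairs of observations, and the two should agree up to rounding.

```python
import random


def gaussian_sphericity_statistics(xs, theta):
    """Return (adjusted statistic, John's statistic, kurtosis estimate)."""
    n, k = len(xs), len(theta)
    zs = [[c - t for c, t in zip(x, theta)] for x in xs]
    # S = (1/n) sum_i (X_i - theta)(X_i - theta)'
    S = [[sum(z[a] * z[b] for z in zs) / n for b in range(k)] for a in range(k)]
    tr_S = sum(S[a][a] for a in range(k))
    tr_S2 = sum(S[a][b] * S[b][a] for a in range(k) for b in range(k))
    fourth = sum(sum(c * c for c in z) ** 2 for z in zs) / n   # mean of d_i^4
    kappa_hat = k * fourth / ((k + 2) * tr_S ** 2) - 1.0       # mean of d_i^2 is tr S
    q_john = n * k * k / 2.0 * (tr_S2 / tr_S ** 2 - 1.0 / k)
    return q_john / (1.0 + kappa_hat), q_john, kappa_hat


def gaussian_statistic_pairwise(xs, theta):
    """Adjusted statistic via the double sum, using d_i^2 d_j^2 (U_i'U_j)^2 = (z_i'z_j)^2."""
    n, k = len(xs), len(theta)
    zs = [[c - t for c, t in zip(x, theta)] for x in xs]
    d2 = [sum(c * c for c in z) for z in zs]
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(zs[i], zs[j]))
            total += dot * dot - d2[i] * d2[j] / k
    return k * (k + 2) * total / (2.0 * sum(d * d for d in d2))


rng = random.Random(0)
theta = [1.0, -2.0, 0.5]
sample = [[t + rng.gauss(0.0, 1.0) for t in theta] for _ in range(40)]
q_adjusted, q_john, kappa_hat = gaussian_sphericity_statistics(sample, theta)
q_pairwise = gaussian_statistic_pairwise(sample, theta)
```

Rescaling the observations around $\theta$ leaves both statistics unchanged, in line with the scale invariance noted above; a monotone but nonlinear radial transformation generally does not.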

4. Rank-based tests for shape. 

4.1. Rank-based versions of efficient central sequences for shape. As already mentioned, the problem with tests based on efficient central sequences is that (with the exception of the adjusted Gaussian tests described in Section 3.3) they are only valid under correctly specified radial densities. In practice, a correct specification $f_1$ of the actual radial density $g_1$ is rather unrealistic and thus the problem has to be treated from a semiparametric point of view, where $g_1$ plays the role of a nuisance.

Within the family of distributions $\bigcup_{\sigma^2}\bigcup_{V}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V;g_1}\}$, where $\theta$ is fixed, consider the null hypothesis $\mathcal{H}_0(\theta, V_0)$ under which $V = V_0$. Throughout, therefore, $\theta$ and $V = V_0$ are fixed, and $\sigma^2$ and the radial density $g_1$ remain unspecified (no moment assumptions are being made here). As we have seen, the scalar nuisance $\sigma^2$ can be taken care of by means of a simple projection, yielding the efficient central sequence. In principle, the infinite-dimensional nuisance $g_1$ can be treated similarly, by projecting central sequences along adequate tangent spaces; see Example 4 of [7]. This approach is rather technical, however. Hallin and Werker [20] showed that appropriate group invariance structures allow for the same result by conditioning central sequences with respect to maximal invariants such as ranks or signs. This is the approach we also adopt here.

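Clearly, the null hypothesis $\mathcal{H}_0(\theta, V_0)$ is invariant under the following two groups of transformations, acting on the observations $X_1, \ldots, X_n$:

(i) the group $\mathcal{G}^{\mathrm{orth}(n)},\circ := \mathcal{G}^{\mathrm{orth}(n)}_{\theta,V_0},\circ$ of $V_0$-orthogonal transformations (centered at $\theta$) consisting of all transformations of the form

\[
\begin{aligned}
X \mapsto g_O(X_1, \ldots, X_n)
&= g_O(\theta + d_1(\theta, V_0) V_0^{1/2} U_1(\theta, V_0), \ldots, \theta + d_n(\theta, V_0) V_0^{1/2} U_n(\theta, V_0)) \\
&:= (\theta + d_1(\theta, V_0) V_0^{1/2} O\, U_1(\theta, V_0), \ldots, \theta + d_n(\theta, V_0) V_0^{1/2} O\, U_n(\theta, V_0)),
\end{aligned}
\]

where $O$ is an arbitrary $k \times k$ orthogonal matrix. In particular, this group contains "rotations" (in the metric associated with $V_0$) around $\theta$, as well as the reflection with respect to $\theta$, that is, the mapping $(X_1, \ldots, X_n) \mapsto (\theta - (X_1 - \theta), \ldots, \theta - (X_n - \theta))$;

(ii) the group $\mathcal{G}^{(n)},\circ := \mathcal{G}^{(n)}_{\theta,V_0},\circ$ of continuous monotone radial transformations, of the form

\[
\begin{aligned}
X \mapsto g_h(X_1, \ldots, X_n)
&= g_h(\theta + d_1(\theta, V_0) V_0^{1/2} U_1(\theta, V_0), \ldots, \theta + d_n(\theta, V_0) V_0^{1/2} U_n(\theta, V_0)) \\
&:= (\theta + h(d_1(\theta, V_0)) V_0^{1/2} U_1(\theta, V_0), \ldots, \theta + h(d_n(\theta, V_0)) V_0^{1/2} U_n(\theta, V_0)),
\end{aligned}
\]

where $h : \mathbb{R}^+ \to \mathbb{R}^+$ is continuous, monotone increasing and such that $h(0) = 0$ and $\lim_{r \to \infty} h(r) = \infty$. In particular, this group includes the subgroup of scale transformations $(X_1, \ldots, X_n) \mapsto (\theta + a(X_1 - \theta), \ldots, \theta + a(X_n - \theta))$, $a > 0$.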

Clearly, the group $\mathcal{G}^{(n)},\circ$ of continuous monotone radial transformations is a generating group for the family of distributions $\bigcup_{\sigma^2}\bigcup_{f_1}\{P^{(n)}_{\theta,\sigma^2,V_0;f_1}\}$, that is, a generating group for the null hypothesis $\mathcal{H}_0(\theta, V_0)$ under consideration. The invariance principle therefore leads to the consideration of test statistics that are measurable with respect to the corresponding maximal invariant, namely the vector $(R_1(\theta, V_0), \ldots, R_n(\theta, V_0), U_1(\theta, V_0), \ldots, U_n(\theta, V_0))$, where $R_i(\theta, V_0)$ denotes the rank of $d_i(\theta, V_0)$ among $d_1(\theta, V_0), \ldots, d_n(\theta, V_0)$. The resulting signed rank test statistics are (strictly) invariant under $\mathcal{G}^{(n)},\circ$, hence distribution-free under $\mathcal{H}_0(\theta, V_0)$.
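The maximal invariant (ranks of the distances together with the signs) is easy to compute, and its invariance under a monotone radial transformation can be checked directly. The sketch below assumes $V_0 = I_k$, a known center $\theta$ and no ties among the distances; the function names and the particular transformation $h(r) = r^3 + r$ are hypothetical choices made for illustration.

```python
import math
import random


def distances_ranks_signs(xs, theta):
    """Return distances d_i, their ranks R_i (1..n) and unit vectors U_i."""
    zs = [[c - t for c, t in zip(x, theta)] for x in xs]
    ds = [math.sqrt(sum(c * c for c in z)) for z in zs]
    us = [[c / d for c in z] for z, d in zip(zs, ds)]
    ranks = [0] * len(ds)
    for r, i in enumerate(sorted(range(len(ds)), key=ds.__getitem__), start=1):
        ranks[i] = r
    return ds, ranks, us


def radial_transform(xs, theta, h):
    """Apply X_i -> theta + h(d_i) U_i for a monotone increasing h with h(0) = 0."""
    ds, _, us = distances_ranks_signs(xs, theta)
    return [[t + h(d) * c for t, c in zip(theta, u)] for d, u in zip(ds, us)]


rng = random.Random(3)
theta = [0.0, 1.0, -1.0]
sample = [[t + rng.gauss(0.0, 2.0) for t in theta] for _ in range(25)]
transformed = radial_transform(sample, theta, lambda r: r ** 3 + r)

_, ranks_before, signs_before = distances_ranks_signs(sample, theta)
_, ranks_after, signs_after = distances_ranks_signs(transformed, theta)
```

The ranks coincide exactly and the signs coincide up to rounding, so any statistic built only from `(ranks, signs)` takes the same value on `sample` and on `transformed`.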

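Now, in the construction of the proposed tests for the null hypothesis $\mathcal{H}_0(\theta, V_0)$, we intend to combine invariance and optimality arguments by considering a (signed-)rank-based version of the $f_1$-efficient central sequences for shape [recall that central sequences are always defined up to $o_P(1)$ terms, under $P^{(n)}_{\vartheta;f_1}$, as $n \to \infty$]. The signed-rank version $\underset{\sim}{\Delta}^{(n)}_{f_1}(\vartheta)$ of the shape-efficient central sequence $\Delta^{*(n)}_{f_1}(\vartheta)$ we plan to use in our nonparametric tests is the $f_1$-score version (based on the scores $K = K_{f_1}$) of the statistic

\[
\begin{aligned}
\underset{\sim}{\Delta}^{(n)}_K(\vartheta)
&:= \frac{1}{2}\, n^{-1/2} M_k (V^{\otimes 2})^{-1/2} \Big[ I_{k^2} - \frac{1}{k} J_k \Big] \sum_{i=1}^n K\Big(\frac{R_i}{n+1}\Big) \operatorname{vec}(U_i U_i') \\
&= \frac{1}{2}\, n^{-1/2} M_k (V^{\otimes 2})^{-1/2} \sum_{i=1}^n K\Big(\frac{R_i}{n+1}\Big) \operatorname{vec}\Big( U_i U_i' - \frac{1}{k} I_k \Big) \\
&= \frac{1}{2}\, n^{-1/2} M_k (V^{\otimes 2})^{-1/2} \Big( \sum_{i=1}^n K\Big(\frac{R_i}{n+1}\Big) \operatorname{vec}(U_i U_i') - \frac{n\, m_K^{(n)}}{k} \operatorname{vec}(I_k) \Big),
\end{aligned} \tag{4.1}
\]

where $R_i = R_i^{(n)}(\theta, V)$ denotes the rank of $d_i = d_i^{(n)}(\theta, V)$ among $d_1, \ldots, d_n$, $U_i = U_i^{(n)}(\theta, V)$ and $m_K^{(n)} := n^{-1} \sum_{i=1}^n K(i/(n+1))$.

Beyond its role in the derivation of the asymptotic distribution of the rank-based random vector (4.1), the following asymptotic representation result shows that $\underset{\sim}{\Delta}^{(n)}_{f_1}(\vartheta)$ is indeed another version of the efficient central sequence $\Delta^{*(n)}_{f_1}(\vartheta)$.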

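Lemma 4.1. Assume that the score function $K : (0,1) \to \mathbb{R}$ is continuous, square integrable and that it can be expressed as the difference of two monotone increasing functions. Then, defining

\[
\Delta^{(n)}_{K;g_1}(\vartheta) := \frac{1}{2}\, n^{-1/2} M_k (V^{\otimes 2})^{-1/2} \Big[ I_{k^2} - \frac{1}{k} J_k \Big] \sum_{i=1}^n K\Big( \tilde G_{1k}\Big(\frac{d_i}{\sigma}\Big) \Big) \operatorname{vec}(U_i U_i'), \tag{4.2}
\]

we have $\underset{\sim}{\Delta}^{(n)}_K(\vartheta) = \Delta^{(n)}_{K;g_1}(\vartheta) + o_{L^2}(1)$ as $n$ goes to infinity, under $P^{(n)}_{\vartheta;g_1}$.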



Proof. See Appendix (Section A. 3). □ 

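4.2. The proposed class of tests. Let $K : (0,1) \to \mathbb{R}$ be some score function as in Lemma 4.1. Writing $E[K(U)]$ and $E[K^2(U)]$ for $\int_0^1 K(u)\,du$ and $\int_0^1 K^2(u)\,du$, respectively, the $K$-score version of the statistics we propose for testing $\mathcal{H}_0 : V = V_0$ is

\[
\underset{\sim}{Q}_K = \underset{\sim}{Q}^{(n)}_K := \frac{k(k+2)}{2n\, E[K^2(U)]} \sum_{i,j=1}^n K\Big(\frac{R_i}{n+1}\Big) K\Big(\frac{R_j}{n+1}\Big) \Big( (U_i' U_j)^2 - \frac{1}{k} \Big), \tag{4.3}
\]

where $R_i = R_i^{(n)}(\theta, V_0)$ and $U_i = U_i^{(n)}(\theta, V_0)$. Letting

\[
\underset{\sim}{S}_K = \underset{\sim}{S}^{(n)}_K := \frac{1}{n} \sum_{i=1}^n K\Big(\frac{R_i}{n+1}\Big) U_i U_i',
\]

these test statistics can be rewritten as

\[
\underset{\sim}{Q}_K = \frac{nk(k+2)}{2 E[K^2(U)]} \Big[ \operatorname{tr} \underset{\sim}{S}_K^2 - \frac{1}{k} \operatorname{tr}^2 \underset{\sim}{S}_K \Big]
= \frac{(k+2)\, E^2[K(U)]}{k\, E[K^2(U)]}\, \frac{nk^2}{2}\, \Big\| \frac{\underset{\sim}{S}_K}{\operatorname{tr} \underset{\sim}{S}_K} - \frac{1}{k} I_k \Big\|^2 + o_P(1) \tag{4.5}
\]

as $n$ goes to infinity, under any distribution [cf. (3.8)]. These test statistics are strictly invariant under $\mathcal{G}^{\mathrm{orth}(n)},\circ$ as well as under $\mathcal{G}^{(n)},\circ$. They admit (up to a multiplicative constant) an interesting interpretation as the sum of squared deviations of the eigenvalues of $\underset{\sim}{S}_K$ from their arithmetic mean.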
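The two expressions of the rank-based statistic (double sum over pairs, and trace form in the weighted sign matrix) can be compared numerically. The following is a minimal sketch, assuming $V_0 = I_k$, a known center $\theta$, no ties, and Wilcoxon scores $K(u) = u$ with $E[K^2(U)] = 1/3$; all function names are hypothetical.

```python
import math
import random


def signs_and_ranks(xs, theta):
    """Unit vectors U_i and ranks R_i of the distances d_i (V_0 = I_k assumed)."""
    zs = [[c - t for c, t in zip(x, theta)] for x in xs]
    ds = [math.sqrt(sum(c * c for c in z)) for z in zs]
    us = [[c / d for c in z] for z, d in zip(zs, ds)]
    ranks = [0] * len(ds)
    for r, i in enumerate(sorted(range(len(ds)), key=ds.__getitem__), start=1):
        ranks[i] = r
    return us, ranks


def rank_statistic_pairwise(xs, theta, score, score_second_moment):
    """Double-sum form of the K-score statistic."""
    us, ranks = signs_and_ranks(xs, theta)
    n, k = len(xs), len(theta)
    w = [score(r / (n + 1)) for r in ranks]
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(us[i], us[j]))
            total += w[i] * w[j] * (dot * dot - 1.0 / k)
    return k * (k + 2) * total / (2.0 * n * score_second_moment)


def rank_statistic_matrix(xs, theta, score, score_second_moment):
    """Trace form based on S_K = (1/n) sum_i K(R_i/(n+1)) U_i U_i'."""
    us, ranks = signs_and_ranks(xs, theta)
    n, k = len(xs), len(theta)
    S = [[0.0] * k for _ in range(k)]
    for u, r in zip(us, ranks):
        w = score(r / (n + 1)) / n
        for a in range(k):
            for b in range(k):
                S[a][b] += w * u[a] * u[b]
    tr = sum(S[a][a] for a in range(k))
    tr2 = sum(S[a][b] * S[b][a] for a in range(k) for b in range(k))
    return n * k * (k + 2) * (tr2 - tr * tr / k) / (2.0 * score_second_moment)


rng = random.Random(1)
theta = [0.5, -1.0]
sample = [[theta[0] + rng.gauss(0.0, 1.0), theta[1] + rng.gauss(0.0, 1.0)] for _ in range(30)]
q_pair = rank_statistic_pairwise(sample, theta, lambda u: u, 1.0 / 3.0)
q_mat = rank_statistic_matrix(sample, theta, lambda u: u, 1.0 / 3.0)
```

The trace form costs $O(nk^2)$ operations instead of $O(n^2 k)$, which matters when the statistic is recomputed many times, for instance when simulating its null distribution.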

The power functions $K_a(u) = u^a$, $a \geq 0$, provide some traditional score functions. The corresponding test statistics are

\[
\underset{\sim}{Q}_{K_a} := \frac{(2a+1)\, k(k+2)}{2n (n+1)^{2a}} \sum_{i,j=1}^n (R_i R_j)^a \Big( (U_i' U_j)^2 - \frac{1}{k} \Big). \tag{4.6}
\]

Important particular cases are the sign-, Wilcoxon- and Spearman-type test statistics, defined by $\underset{\sim}{Q}_S := \underset{\sim}{Q}_{K_0}$, $\underset{\sim}{Q}_W := \underset{\sim}{Q}_{K_1}$ and $\underset{\sim}{Q}_{SP} := \underset{\sim}{Q}_{K_2}$, respectively. In general, the resulting tests are not optimal at any density (they sometimes are, though; for instance, the Wilcoxon test $\underset{\sim}{\phi}_W$ is optimal in dimension $k = 2$ at Student densities with two degrees of freedom; see Section 4.3), but they nevertheless yield good overall performances and are simple to compute. The sign test statistic $\underset{\sim}{Q}_S$ essentially coincides with that proposed by Ghosh and Sengupta [13] where, however, the $U_i' U_j$ are computed from Randles' interdirections (see [46]).

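Local asymptotic optimality under radial density $f_1$ is achieved by the test based on $\underset{\sim}{Q}_{f_1} := \underset{\sim}{Q}_{K_{f_1}}$. This test statistic takes the form

\[
\underset{\sim}{Q}_{f_1} = \frac{k(k+2)}{2n\, \mathcal{J}_k(f_1)} \sum_{i,j=1}^n K_{f_1}\Big(\frac{R_i}{n+1}\Big) K_{f_1}\Big(\frac{R_j}{n+1}\Big) \Big( (U_i' U_j)^2 - \frac{1}{k} \Big), \tag{4.7}
\]

which, letting $\underset{\sim}{S}_{f_1} = \underset{\sim}{S}^{(n)}_{f_1} := (1/n) \sum_{i=1}^n K_{f_1}(R_i/(n+1))\, U_i U_i'$, simplifies to

\[
\underset{\sim}{Q}_{f_1} = \frac{nk(k+2)}{2 \mathcal{J}_k(f_1)} \Big[ \operatorname{tr} \underset{\sim}{S}_{f_1}^2 - \frac{1}{k} \operatorname{tr}^2 \underset{\sim}{S}_{f_1} \Big]
= \frac{k(k+2)}{\mathcal{J}_k(f_1)}\, \frac{nk^2}{2}\, \Big\| \frac{\underset{\sim}{S}_{f_1}}{\operatorname{tr} \underset{\sim}{S}_{f_1}} - \frac{1}{k} I_k \Big\|^2 + o_P(1) \tag{4.8}
\]

as $n$ goes to infinity, still under any distribution. The van der Waerden (Gaussian scores $f_1 = \phi_1$) test, for instance, is based on the statistic

\[
\underset{\sim}{Q}_{\mathrm{vdW}} := \frac{1}{2n} \sum_{i,j=1}^n \Psi_k^{-1}\Big(\frac{R_i}{n+1}\Big) \Psi_k^{-1}\Big(\frac{R_j}{n+1}\Big) \Big( (U_i' U_j)^2 - \frac{1}{k} \Big),
\]

where $\Psi_k$ stands for the chi-square distribution function with $k$ degrees of freedom. See (4.10) for the rank-based test statistics based on Student scores. In order to describe the asymptotic behavior of $\underset{\sim}{Q}_K$ and $\underset{\sim}{Q}_{f_1}$, we will need the quantities

\[
\mathcal{J}_k(K; g_1) := \int_0^1 K(u) K_{g_1}(u)\, du \qquad \text{and} \qquad \mathcal{J}_k(f_1, g_1) := \int_0^1 K_{f_1}(u) K_{g_1}(u)\, du
\]

[$\mathcal{J}_k(f_1, g_1)$ can be interpreted as a measure of cross-information].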

Denote by $\underset{\sim}{\phi}^{(n)}_K$ (resp. by $\underset{\sim}{\phi}^{(n)}_{f_1}$) the rank-based test which consists in rejecting $\mathcal{H}_0 : V = V_0$ as soon as $\underset{\sim}{Q}^{(n)}_K$ defined in (4.3) [resp. $\underset{\sim}{Q}^{(n)}_{f_1}$ defined in (4.7)] exceeds the $\alpha$-upper-quantile of a chi-square distribution with $k(k+1)/2 - 1$ degrees of freedom. We can now state the main result of this paper. Note that here the unions over $g_1$ extend over all possible standardized radial densities: contrary to the Gaussian tests described in Section 3.3, where finite fourth-order moments are required, the tests $\underset{\sim}{\phi}^{(n)}_K$ and $\underset{\sim}{\phi}^{(n)}_{f_1}$ are valid without any moment restrictions.



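Proposition 4.1. Let $K$ be a continuous, square integrable score function defined on $(0,1)$ that can be expressed as the difference of two monotone increasing functions. Similarly, assume that $f_1$ [satisfying Assumptions (A1) and (A2)] is such that $K_{f_1}$ is continuous and can be expressed as the difference of two monotone increasing functions. Then:

(i) $\underset{\sim}{Q}^{(n)}_K$ and $\underset{\sim}{Q}^{(n)}_{f_1}$ are asymptotically chi-square with $k(k+1)/2 - 1$ degrees of freedom under $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V_0;g_1}\}$, and asymptotically noncentral chi-square, still with $k(k+1)/2 - 1$ degrees of freedom but with noncentrality parameters

\[
\frac{\mathcal{J}_k^2(K; g_1)}{2k(k+2)\, E[K^2(U)]} \Big[ \operatorname{tr}((V_0^{-1} v)^2) - \frac{1}{k} (\operatorname{tr} V_0^{-1} v)^2 \Big]
\]

and

\[
\frac{\mathcal{J}_k^2(f_1, g_1)}{2k(k+2)\, \mathcal{J}_k(f_1)} \Big[ \operatorname{tr}((V_0^{-1} v)^2) - \frac{1}{k} (\operatorname{tr} V_0^{-1} v)^2 \Big],
\]

respectively, under $\bigcup_{\sigma^2}\{P^{(n)}_{\theta,\sigma^2,V_0+n^{-1/2}v;g_1}\}$;

(ii) the sequences of tests $\underset{\sim}{\phi}^{(n)}_K$ and $\underset{\sim}{\phi}^{(n)}_{f_1}$ have asymptotic level $\alpha$ under $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V_0;g_1}\}$;

(iii) the sequence of tests $\underset{\sim}{\phi}^{(n)}_{f_1}$ is locally and asymptotically maximin-efficient, still at asymptotic level $\alpha$, for $\bigcup_{\sigma^2}\bigcup_{g_1}\{P^{(n)}_{\theta,\sigma^2,V_0;g_1}\}$ against alternatives of the form $\bigcup_{\sigma^2}\bigcup_{V \neq V_0}\{P^{(n)}_{\theta,\sigma^2,V;f_1}\}$.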



Proof. See Appendix (Section A. 3). □ 



Throughout the paper, our rank-based tests are described in terms of ap- 
proximate critical values based on asymptotic chi-square null distributions. 
Of course, exact critical values could also be considered. These exact values 
can easily be simulated by sampling the n! possible values of the vector of 
ranks and by independently generating uniformly distributed (over the unit 
sphere) signs. 
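The simulation of exact critical values described above can be sketched as follows. This is only an illustration: it assumes the trace form of the $K$-score statistic, draws the rank vector as a uniformly random permutation and the signs as normalized standard Gaussian vectors (which are uniform on the unit sphere), and uses hypothetical names and default settings (number of replications, seed).

```python
import math
import random


def simulate_null_quantile(n, k, score, score_second_moment,
                           alpha=0.05, reps=2000, seed=0):
    """Monte Carlo estimate of the alpha upper-quantile of the K-score statistic."""
    rng = random.Random(seed)
    weights = [score(i / (n + 1)) for i in range(1, n + 1)]
    stats = []
    for _ in range(reps):
        rng.shuffle(weights)                      # random permutation of the ranks
        S = [[0.0] * k for _ in range(k)]
        for w in weights:
            g = [rng.gauss(0.0, 1.0) for _ in range(k)]
            norm = math.sqrt(sum(c * c for c in g))
            u = [c / norm for c in g]             # uniform direction on the sphere
            for a in range(k):
                for b in range(k):
                    S[a][b] += w * u[a] * u[b] / n
        tr = sum(S[a][a] for a in range(k))
        tr2 = sum(S[a][b] * S[b][a] for a in range(k) for b in range(k))
        stats.append(n * k * (k + 2) * (tr2 - tr * tr / k)
                     / (2.0 * score_second_moment))
    stats.sort()
    index = min(reps - 1, int(math.ceil((1.0 - alpha) * reps)) - 1)
    return stats[index]


# Wilcoxon scores K(u) = u, E[K^2(U)] = 1/3, bivariate case with n = 30.
exact_critical_value = simulate_null_quantile(30, 2, lambda u: u, 1.0 / 3.0)
```

For $k = 2$ the asymptotic reference distribution is chi-square with $2$ degrees of freedom, so the simulated value can be compared with the corresponding asymptotic quantile to judge how accurate the approximation is at a given sample size.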
4.3. Asymptotic relative efficiencies. Propositions 3.2 and 4.1 allow the computation of ARE values for $\underset{\sim}{\phi}^{(n)}_K$ (hence, for $\underset{\sim}{\phi}^{(n)}_{f_1}$) with respect to the adjusted John test $\phi^{(n)}_{\mathcal{N}}$ (therefore, also with respect to the adjusted Mauchly test) as ratios of the noncentrality parameters in the asymptotic distributions of their respective test statistics under local alternatives, for various radial densities $g_1$. These adjusted tests are still not valid unless $\kappa_k(g_1) < \infty$ and, therefore, our ARE values also require finite fourth-order moments. Recall, however, that the signed rank tests $\underset{\sim}{\phi}^{(n)}_K$ remain valid without such moment assumption so that, when $g_1$ is such that $\kappa_k(g_1) = \infty$, the asymptotic relative efficiency of any $\underset{\sim}{\phi}^{(n)}_K$ with respect to $\phi^{(n)}_{\mathcal{N}}$ can actually be considered as being infinite.

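Proposition 4.2. Let $K$ satisfy the assumptions of Proposition 4.1. Then the asymptotic relative efficiency of $\underset{\sim}{\phi}^{(n)}_K$ with respect to the parametric Gaussian test $\phi^{(n)}_{\mathcal{N}}$, under radial density $g_1$ satisfying Assumptions (A1), (A2) and $\kappa_k(g_1) < \infty$, is

\[
\mathrm{ARE}_{k,g_1}[\underset{\sim}{\phi}^{(n)}_K / \phi^{(n)}_{\mathcal{N}}] = \frac{1 + \kappa_k(g_1)}{k(k+2)}\, \frac{\mathcal{J}_k^2(K; g_1)}{E[K^2(U)]}.
\]

For $K$ of the form $K_{f_1}$, this yields

\[
\mathrm{ARE}_{k,g_1}[\underset{\sim}{\phi}^{(n)}_{f_1} / \phi^{(n)}_{\mathcal{N}}] = \frac{1}{(k+2)^2}\, \frac{E_k(g_1)}{D_k^2(g_1)}\, \frac{\mathcal{J}_k^2(f_1, g_1)}{\mathcal{J}_k(f_1)}.
\]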

In order to investigate the numerical values of these AREs, we consider the tests $\underset{\sim}{\phi}_{f^t_{1,\nu}}$ based on $t_\nu$-scores, that is, the scores associated with the Student radial densities introduced in Section 1.2. One can easily check that $\varphi_{f^t_{1,\nu}}(r) = (k+\nu)\, a_{k,\nu}\, r / (\nu + a_{k,\nu} r^2)$. Also, since $a_{k,\nu} \|X_1\|^2 / k$, under $P^{(n)}_{0,1,I_k;f^t_{1,\nu}}$, is Fisher-Snedecor with $k$ and $\nu$ degrees of freedom, one can show that the test statistic $\underset{\sim}{Q}_{f^t_{1,\nu}}$ takes the form

\[
\underset{\sim}{Q}_{f^t_{1,\nu}} = \frac{k^2 (k+\nu)(k+\nu+2)}{2n} \sum_{i,j=1}^n \frac{T_i^{(n)}\, T_j^{(n)}}{(\nu + k T_i^{(n)})(\nu + k T_j^{(n)})} \Big( (U_i' U_j)^2 - \frac{1}{k} \Big) \tag{4.10}
\]

[see (2.2)], where, denoting by $G_{k,\nu}$ the Fisher-Snedecor distribution function with $k$ and $\nu$ degrees of freedom, we let $T_i^{(n)} := G_{k,\nu}^{-1}(R_i/(n+1))$. Note that the sign test and the van der Waerden test are obtained by letting $\nu \to 0$ and $\nu \to \infty$, respectively. An easy calculation also shows that for $\nu = 2$, $\underset{\sim}{Q}_{f^t_{1,\nu}}$ and $\underset{\sim}{Q}_{K_a}$ coincide for $a = 2/k$, $k = 2, 3, 4, \ldots$. Hence, for $k = 2$, the Wilcoxon test $\underset{\sim}{\phi}_W$ is optimal at Student densities with two degrees of freedom.

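Numerical values of the AREs of several of the proposed rank-based tests with respect to the Gaussian test, under various $t_\nu$ and normal densities, are given in Table 2. For the sign test $\underset{\sim}{\phi}_S$, closed-form expressions are

\[
\mathrm{ARE}_{k,f^t_{1,\nu}}[\underset{\sim}{\phi}_S / \phi_{\mathcal{N}}] = \frac{k(\nu - 2)}{(k+2)(\nu - 4)}
\qquad \text{and} \qquad
\mathrm{ARE}_{k,\phi_1}[\underset{\sim}{\phi}_S / \phi_{\mathcal{N}}] = \frac{k}{k+2}
\]

[recall that $\kappa_k(f^t_{1,\nu}) < \infty$ iff $\nu > 4$, which is the condition for a Student radial density to satisfy $E_k(f^t_{1,\nu}) < \infty$]. Also, the highest ARE with respect to the Gaussian test $\phi_{\mathcal{N}}$ that can be achieved under $f^t_{1,\nu}$ is

\[
\mathrm{ARE}_{k,f^t_{1,\nu}}[\underset{\sim}{\phi}_{f^t_{1,\nu}} / \phi_{\mathcal{N}}] = \frac{(\nu - 2)(k + \nu)}{(\nu - 4)(k + \nu + 2)}.
\]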
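The closed-form AREs of the sign test are simple enough to evaluate by hand, but a short sketch makes it easy to compare them with the tabulated values. The function names are hypothetical. The expression used for the highest achievable ARE under a Student density is obtained from Proposition 4.2 with $f_1 = g_1$, $\kappa_k = 2/(\nu - 4)$ and the Student information $k(k+2)(k+\nu)/(k+\nu+2)$ implied by the constant in (4.10); it should be read as an assumption of this sketch.

```python
def are_sign_student(k, nu):
    """ARE of the sign test w.r.t. the Gaussian test under a t density (nu > 4)."""
    if nu <= 4:
        return float("inf")   # infinite kurtosis: the Gaussian test is not valid
    return k * (nu - 2) / ((k + 2) * (nu - 4))


def are_sign_normal(k):
    """ARE of the sign test w.r.t. the Gaussian test under normal densities."""
    return k / (k + 2)


def are_optimal_student(k, nu):
    """Highest achievable ARE under a t density, attained by t_nu scores (assumed form)."""
    if nu <= 4:
        return float("inf")
    return (nu - 2) * (k + nu) / ((nu - 4) * (k + nu + 2))


# Sign-test AREs for k = 2, 3, 4, 6, 10 and nu = 5, 8, 15, 20, plus the normal case.
sign_rows = {
    k: [round(are_sign_student(k, nu), 3) for nu in (5, 8, 15, 20)]
       + [round(are_sign_normal(k), 3)]
    for k in (2, 3, 4, 6, 10)
}
```

As a sanity check on the bound, `are_optimal_student(2, 5)` is about 2.333, which is at least as large as every finite shape ARE reported for $k = 2$ and five degrees of freedom.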

The ARE values in Table 2 are uniformly good, especially for the van der Waerden test $\underset{\sim}{\phi}_{\mathrm{vdW}}$, for which they are not only uniformly larger than 1, but also uniformly larger than the corresponding AREs for location, namely the AREs of van der Waerden rank tests with respect to the classical Hotelling ones when testing that the center of symmetry $\theta$ of an elliptical distribution is equal to some fixed $\theta_0$, as in [17]. This Pitman dominance of $\underset{\sim}{\phi}_{\mathrm{vdW}}$ over $\phi_{\mathcal{N}}$ also holds under lighter-than-Gaussian radial tails, as can be checked by again considering the power-exponential radial densities defined in Section 1.2; for instance, in the problem of testing for trivariate sphericity, the corresponding AREs are 1.166, 1.014, 1.000, 1.039, 1.108 and 1.183 for $\eta$ = 0.5, 0.8, 1, 1.5, 2 and 2.5, respectively. Actually, it can be shown [43] that this is a general property and that $\underset{\sim}{\phi}_{\mathrm{vdW}}$, from the Pitman point of view, uniformly dominates its Gaussian parametric competitors.

4.4. Unspecified location $\theta$. In practice, the center of symmetry $\theta$ is seldom specified and must be replaced, in test statistics, with an estimator $\hat\theta = \hat\theta^{(n)}$. Under very mild conditions, any root-$n$ consistent estimator will be adequate (in principle, after due discretization), but we recommend the (rotation-equivariant) spatial median (see, e.g., Möttönen and Oja [37]), which is itself "sign-based."

The asymptotic impact of this substitution on the validity of the signed 
rank tests proposed in Section 4.2 could be studied directly (see, e.g., [45]), 
Table 2

AREs of the $t_6$-, van der Waerden-, sign- and Wilcoxon-score rank-based tests for shape and (in parentheses) location, with respect to the corresponding parametric Gaussian tests, under k-dimensional Student (1, 3, 4, 5, 8, 15 and 20 degrees of freedom) and normal densities, respectively, for k = 2, 3, 4, 6 and 10

                  Degrees of freedom of the underlying t density
Score   k     1        3        4        5        8        15       20       ∞

t_6     2     +∞       +∞       +∞       2.331    1.248    1.045    1.013    0.957
              (+∞)     (2.067)  (1.484)  (1.294)  (1.107)  (1.009)  (0.986)  (0.927)
        3     +∞       +∞       +∞       2.398    1.267    1.052    1.018    0.957
              (+∞)     (2.174)  (1.540)  (1.331)  (1.124)  (1.014)  (0.988)  (0.919)
        4     +∞       +∞       +∞       2.453    1.284    1.058    1.023    0.958
              (+∞)     (2.258)  (1.584)  (1.361)  (1.139)  (1.019)  (0.990)  (0.913)
        6     +∞       +∞       +∞       2.537    1.311    1.070    1.031    0.959
              (+∞)     (2.382)  (1.652)  (1.408)  (1.163)  (1.028)  (0.995)  (0.905)
        10    +∞       +∞       +∞       2.646    1.349    1.087    1.044    0.963
              (+∞)     (2.534)  (1.736)  (1.468)  (1.196)  (1.043)  (1.005)  (0.896)

vdW     2     +∞       +∞       +∞       2.204    1.215    1.047    1.025    1.000
              (+∞)     (1.729)  (1.301)  (1.171)  (1.060)  (1.016)  (1.009)  (1.000)
        3     +∞       +∞       +∞       2.270    1.233    1.052    1.028    1.000
              (+∞)     (1.798)  (1.336)  (1.194)  (1.069)  (1.019)  (1.011)  (1.000)
        4     +∞       +∞       +∞       2.326    1.249    1.057    1.031    1.000
              (+∞)     (1.853)  (1.364)  (1.212)  (1.077)  (1.022)  (1.012)  (1.000)
        6     +∞       +∞       +∞       2.413    1.275    1.066    1.036    1.000
              (+∞)     (1.935)  (1.408)  (1.242)  (1.092)  (1.027)  (1.016)  (1.000)
        10    +∞       +∞       +∞       2.531    1.312    1.080    1.045    1.000
              (+∞)     (2.041)  (1.467)  (1.283)  (1.112)  (1.035)  (1.021)  (1.000)

S       2     +∞       +∞       +∞       1.500    0.750    0.591    0.563    0.500
              (+∞)     (2.000)  (1.388)  (1.185)  (0.984)  (0.877)  (0.851)  (0.785)
        3     +∞       +∞       +∞       1.800    0.900    0.709    0.675    0.600
              (+∞)     (2.162)  (1.500)  (1.281)  (1.063)  (0.947)  (0.920)  (0.849)
        4     +∞       +∞       +∞       2.000    1.000    0.788    0.750    0.667
              (+∞)     (2.250)  (1.561)  (1.333)  (1.107)  (0.986)  (0.958)  (0.884)
        6     +∞       +∞       +∞       2.250    1.125    0.886    0.844    0.750
              (+∞)     (2.344)  (1.626)  (1.389)  (1.153)  (1.027)  (0.997)  (0.920)
        10    +∞       +∞       +∞       2.500    1.250    0.985    0.938    0.833
              (+∞)     (2.422)  (1.681)  (1.436)  (1.192)  (1.062)  (1.031)  (0.951)

W       2     +∞       +∞       +∞       2.258    1.174    0.956    0.919    0.844
              (+∞)     (1.748)  (1.317)  (1.185)  (1.066)  (1.015)  (1.005)  (0.985)
        3     +∞       +∞       +∞       2.386    1.246    1.022    0.985    0.913
              (+∞)     (1.621)  (1.233)  (1.117)  (1.019)  (0.983)  (0.978)  (0.975)
        4     +∞       +∞       +∞       2.432    1.273    1.048    1.012    0.945
              (+∞)     (1.533)  (1.171)  (1.064)  (0.979)  (0.954)  (0.952)  (0.961)
        6     +∞       +∞       +∞       2.451    1.283    1.060    1.026    0.969
              (+∞)     (1.422)  (1.090)  (0.994)  (0.921)  (0.908)  (0.911)  (0.938)
        10    +∞       +∞       +∞       2.426    1.264    1.045    1.013    0.970
              (+∞)     (1.315)  (1.007)  (0.919)  (0.855)  (0.851)  (0.857)  (0.907)

but is more conveniently handled via Le Cam's third lemma, which allows the derivation of the asymptotic distribution under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ of the statistic $Q^{(n)}_{K;\hat\theta}$, that is, of the statistic $Q^{(n)}_{K;\theta}$ considered in Section 4.2, but computed at $\hat\theta$ instead of $\theta$. This lemma applies in the parametric location experiment $\mathcal{E}^{(n)}_{g_1} := \{\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1} \mid \theta \in \mathbb{R}^k\}$, provided that it is ULAN, which essentially requires $g_1$ to satisfy Assumption (A1) (see [17]).

The asymptotic distribution, as $n \to \infty$, of $Q^{(n)}_{K;\theta+n^{-1/2}\tau^{(n)}}$ under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ for any bounded sequence $\tau^{(n)}$ is the same as under $\mathrm{P}^{(n)}_{\theta+n^{-1/2}\tau^{(n)},\sigma^2,V;g_1}$ [i.e., in view of part (i) of Proposition 4.1, chi-square with $k(k+1)/2 - 1$ degrees of freedom], provided that the asymptotic joint distribution, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, of $\Delta^{(n)}_{K;g_1}(\vartheta)$ [defined in (4.2)] and the central sequence for location $\Delta^{(n)}_{g_1;1}(\vartheta)$ in $\mathcal{E}^{(n)}_{g_1}$ [as defined in (2.5)] is normal with block-diagonal asymptotic covariance. Now, this is automatically satisfied under the assumptions made on $K$: indeed, both $\Delta^{(n)}_{K;g_1}(\vartheta)$ and $\Delta^{(n)}_{g_1;1}(\vartheta)$ are sums of i.i.d. vectors with finite variances and, in view of the independence under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ between $d_i^{(n)}(\theta,V)$ and $U_i^{(n)}(\theta,V)$, have a cross-covariance matrix proportional to $\mathrm{E}[\operatorname{vec}(U_iU_i')U_i'] = 0$. Classical reasoning then extends this to random sequences of the form $\tau^{(n)} = n^{1/2}(\hat\theta - \theta)$, where $n^{1/2}(\hat\theta - \theta)$ is $O_{\mathrm P}(1)$ and $\hat\theta$ is locally discrete, that is, such that the number, under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$, of possible values in balls of the form $\{z \in \mathbb{R}^k \mid n^{1/2}\|z - \theta\| \le c\}$ remains bounded as $n \to \infty$. It is well known that this latter assumption has no practical consequences (see, e.g., [30]). The null distribution of $Q^{(n)}_{K;\hat\theta}$ is thus the same, then, as that of $Q^{(n)}_{K;\theta}$.

However, Le Cam's third lemma only provides asymptotic equivalence in distribution results. Asymptotic equivalence in probability [i.e., a result of the form $Q^{(n)}_{K;\hat\theta} - Q^{(n)}_{K;\theta} = o_{\mathrm P}(1)$] under $\mathrm{P}^{(n)}_{\theta,\sigma^2,V;g_1}$ requires more stringent asymptotic linearity results, such as those in Proposition A.1 of [16], or more general methods, such as the one recently developed by Andreou and Werker [2].

Note that $Q^{(n)}_{K;\hat\theta}$ is no longer strictly invariant or distribution-free, but remains asymptotically so, in the sense of being asymptotically equivalent to its genuinely invariant and distribution-free counterpart $Q^{(n)}_{K;\theta}$. This asymptotic equivalence carries over to contiguous alternatives so that local optimality properties are also preserved. Incidentally, note that $Q^{(n)}_{K;\hat\theta}$ is translation-invariant whenever $\hat\theta$ is translation-equivariant.

5. Validity and consistency properties.

5.1. Null hypothesis: sphericity or unit shape? Our rank tests are basically intended for the null hypothesis of sphericity, not for the hypothesis of isotropy, nor for that of unit shape. Indeed, the (asymptotic) size of $\phi^{(n)}_K$ does not, in general, match the nominal $\alpha$-level under nonelliptical densities, even for unit shape matrices $V = I_k$.

One important exception to this general rule is the multivariate sign test $\phi^{(n)}_S$, based on the test statistic [with scores $K(u) = 1$] $Q^{(n)}_S := Q^{(n)}_{K_0}$ given in (4.6). This test in [13] is described as a test of sphericity. However, since the ranks are not involved, $\phi^{(n)}_S$ remains valid under the hypothesis of isotropy and hence (since only the centering and second-order structure of the matrices $U_iU_i'$ matter) under the hypothesis of unit shape with isotropic fourth-order moments, that is, provided that the moments of the signs $U_i$ coincide with those of the uniform distribution over the unit sphere in $\mathbb{R}^k$ up to order four, so that

$$\mathrm{E}[U_iU_i'] = \frac{1}{k} I_k$$

and

$$\mathrm{E}[\operatorname{vec}(U_iU_i')(\operatorname{vec}(U_iU_i'))'] = \frac{1}{k(k+2)}\,[I_{k^2} + K_k + J_k].$$
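
The isotropic fourth-order moment condition can be checked numerically. The sketch below is only an illustration, not part of the paper's procedures: it assumes column-stacked vec, takes $K_k$ to be the commutation matrix and $J_k = (\operatorname{vec} I_k)(\operatorname{vec} I_k)'$, and compares $(I_{k^2} + K_k + J_k)/(k(k+2))$ with a Monte Carlo estimate for signs drawn uniformly on the unit sphere. All function names are hypothetical.

```python
import math
import random


def idx(i, j, k):
    """Position of entry (i, j) in the column-stacked vec of a k x k matrix."""
    return j * k + i


def isotropic_fourth_moment(k):
    """Return (I_{k^2} + K_k + J_k) / (k (k + 2)) as a nested list."""
    n = k * k
    m = [[0.0] * n for _ in range(n)]
    for a in range(k):
        for b in range(k):
            m[idx(a, b, k)][idx(a, b, k)] += 1.0  # identity I_{k^2}
            m[idx(a, b, k)][idx(b, a, k)] += 1.0  # commutation matrix K_k
            m[idx(a, a, k)][idx(b, b, k)] += 1.0  # J_k = vec(I_k) vec(I_k)'
    c = k * (k + 2)
    return [[x / c for x in row] for row in m]


def empirical_fourth_moment(k, n_draws, rng):
    """Monte Carlo estimate of E[vec(UU') vec(UU')'], U uniform on the unit sphere."""
    n = k * k
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        r = math.sqrt(sum(x * x for x in z))
        u = [x / r for x in z]  # normalized Gaussian vector: uniform on the sphere
        v = [u[p % k] * u[p // k] for p in range(n)]  # vec(UU'), column-stacked
        for p in range(n):
            for q in range(n):
                acc[p][q] += v[p] * v[q]
    return [[x / n_draws for x in row] for row in acc]


def max_abs_gap(a, b):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    theo = isotropic_fourth_moment(2)
    emp = empirical_fourth_moment(2, 20000, random.Random(0))
    print("largest entrywise gap:", max_abs_gap(theo, emp))
```

For $k = 2$ the theoretical matrix has entries $3/8$ for $\mathrm{E}[U_{1}^4]$ and $1/8$ for $\mathrm{E}[U_{1}^2U_{2}^2]$.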

The validity of this test can be extended to the whole hypothesis of unit shape if estimated moments of order four are substituted for the isotropic ones, yielding the adjusted sign test $\phi^{*(n)}_S$, based on the statistic

$$Q^{*(n)}_S := \Big(\sum_{i=1}^n \operatorname{vec}(U_iU_i') - \frac{n}{k}\operatorname{vec}(I_k)\Big)'(V^{\otimes 2})^{-1/2}M_k' \Big[\sum_{i=1}^n M_k(V^{\otimes 2})^{-1/2}\Big((\operatorname{vec}(U_iU_i'))(\operatorname{vec}(U_iU_i'))' - \frac{1}{k^2}J_k\Big)(V^{\otimes 2})^{-1/2}M_k'\Big]^{-1} \times M_k(V^{\otimes 2})^{-1/2}\Big(\sum_{i=1}^n \operatorname{vec}(U_iU_i') - \frac{n}{k}\operatorname{vec}(I_k)\Big). \tag{5.1}$$

Unfortunately, the benefits of Lemma 3.1 are lost and the adjusted test statistic $Q^{*(n)}_S$ does not retain the elegant and simple structure [cf. (1.4) and (1.5)] of John's test.

One is tempted to apply a similar idea to our rank-based tests $\phi^{(n)}_K$. An estimate of the covariance matrix of $\Delta^{(n)}_K(\vartheta)$ that does not exploit the elliptical independence between the ranks and the signs is indeed quite possible. But the expectation of $\sum_{i=1}^n K(R_i/(n+1))\operatorname{vec}(U_iU_i')$ [reducing to $\frac{1}{k}\sum_{i=1}^n K(i/(n+1))\operatorname{vec}(I_k)$ under sphericity] is no longer distribution-free if the assumption of ellipticity is abandoned, and replacing this expectation with an empirical centering would induce a noncentrality parameter in the asymptotic null distribution of the test statistic $Q^{(n)}_K$.

From the point of view of (asymptotic) validity and with the exception of the multivariate sign test [the adjusted version (5.1)], which is a test of unit shape, our rank-based tests $\phi^{(n)}_K$ thus only qualify for the null hypothesis of sphericity.



5.2. Nonlocal alternatives: consistency issues. Validity under the null hypothesis is not the only requirement for a test $\phi$ to qualify as a test of $\mathcal{H}_0$, say, against $\mathcal{H}_1$, and consistency under $\mathcal{H}_1$ is certainly an equally important issue. In this respect, the larger the overarching model $\mathcal{H} := \mathcal{H}_0 + \mathcal{H}_1$ (with $+$ standing for disjoint union), the better the test. Although optimality results have been derived under an overall hypothesis $\mathcal{H}$ of ellipticity, the most "natural" $\mathcal{H}$ here should consist of the collection of all i.i.d. sample distributions from nonvanishing k-dimensional densities $f$.

However, the results of the previous sections are entirely local to the null hypothesis of sphericity and do not allow for any conclusions under nonlocal alternatives. Proposition 5.1 below, on the other hand, provides a characterization of consistency under nonlocal alternatives. Denote by $\mathcal{K}^{(n)}(f)$ the hypothesis under which the observations $X_i$ are i.i.d. with nonvanishing, possibly nonelliptic density $f$. Our rank tests $\phi^{(n)}_K$ are consistent under $\mathcal{K}^{(n)}(f)$ iff the quadratic test statistic $Q^{(n)}_K$ is unbounded in probability, that is, iff, for any fixed $q$, $\mathrm{P}[Q^{(n)}_K > q] \to 1$ as $n \to \infty$, under $\mathcal{K}^{(n)}(f)$ or, equivalently, iff, for all $t > 0$,

$$\mathrm{P}\Big[\Big\| n^{-1/2}\sum_{i=1}^n K\Big(\frac{R_i}{n+1}\Big)\operatorname{vec}\Big(U_iU_i' - \frac{1}{k}I_k\Big)\Big\| > t\Big] \to 1 \tag{5.2}$$

as $n \to \infty$, under $\mathcal{K}^{(n)}(f)$ [see (4.5)], which we unambiguously write as $n^{-1/2}\sum_{i=1}^n K(R_i/(n+1))\operatorname{vec}(U_iU_i' - \frac{1}{k}I_k) \to \infty$ as $n \to \infty$. We then have the following necessary and/or sufficient consistency conditions:

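Proposition 5.1. Assume that the score function $K : (0,1) \to \mathbb{R}$ can be expressed as the difference $K_1 - K_2$ of two monotone increasing, absolutely continuous and square integrable functions. Then:

(i) $\phi^{(n)}_K$ is consistent iff, under $\mathcal{K}^{(n)}(f)$, as $n \to \infty$,

$$n^{-1/2}\sum_{i=1}^n \mathrm{E}\Big[K\Big(\frac{R_i^{(n)}}{n+1}\Big)\,\Big|\,U^{(n)}\Big]\operatorname{vec}\Big(U_iU_i' - \frac{1}{k}I_k\Big) \to \infty, \tag{5.3}$$

where $R_i^{(n)} = R_i^{(n)}(\theta, I_k)$, $U_i = U_i(\theta, I_k)$ and $U^{(n)} := (U_1, \ldots, U_n)$.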

(ii) If the square integrability condition on $K_1$ and $K_2$ in (i) is reinforced into

$$J(K_i) := \int_0^1 u^{1/2}(1-u)^{1/2}\,dK_i(u) < \infty, \qquad i = 1, 2 \tag{5.4}$$

(a classical condition that goes back to Hoeffding [22]), then $\phi^{(n)}_K$ is consistent iff, under $\mathcal{K}^{(n)}(f)$, as $n \to \infty$,

n 



-1/2 



EE 



(5.5) 



i=l 



X vec ( U, U' 



00. 



(iii) If the Hoeffding condition (5.4) is satisfied and, moreover, $K$ is convex, then a sufficient condition for $\phi^{(n)}_K$ to be consistent is that, for some $\ell$, either



K 



(5.6) 



or 



(5.7) 



K 



E[/[d2 < di] vec(UiU; - {l/k)lk)} 
E[vec(UiU'i - (l/fc)Ifc)7] 

E[iC(P[(i2 < cii|(ii,Ui,U2])vec(UiU; - (1/^)1^) 
E[vec(UiU'i - {l/k)lk)j] 

W[d2 < di] vec(UiU'i - {l/k)lk)j 



>0 



E[vec(UiU'i - (l/fc)Ifc)+] 

^[K{V[d2 < (ii|(ii,Ui,U2])vec(UiU; - (lA)Ifc)7 
E[vec(UiU; - {l/k)lk)j] 



>0 
under $\mathcal{K}^{(n)}(f)$, where $\operatorname{vec}(U_1U_1' - \frac{1}{k}I_k)^{\pm}_\ell$ stand for the positive and negative parts of $\operatorname{vec}(U_1U_1' - \frac{1}{k}I_k)_\ell$, respectively.
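(iv) The Wilcoxon test $\phi^{(n)}_W$ based on

$$Q^{(n)}_W := \frac{3k(k+2)}{2n(n+1)^2}\sum_{i,j=1}^n R_iR_j\Big((U_i'U_j)^2 - \frac{1}{k}\Big)$$

[see (4.6)] is consistent iff, under $\mathcal{K}^{(n)}(f)$,

$$\mathrm{E}\Big[I[d_2 < d_1]\operatorname{vec}\Big(U_1U_1' - \frac{1}{k}I_k\Big)\Big] \ne 0. \tag{5.8}$$

(v) The adjusted sign test $\phi^{*(n)}_S$ based on (5.1) is consistent iff, under $\mathcal{K}^{(n)}(f)$,

$$\mathrm{E}\Big[\operatorname{vec}\Big(U_1U_1' - \frac{1}{k}I_k\Big)\Big] \ne 0. \tag{5.9}$$

Proof. See Appendix (Section A.4). □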



Note that the Hoeffding condition in (ii) only slightly reinforces the square integrability condition on $K$: Hoeffding [22] shows that (5.4) holds as soon as $\int_0^1 (K(u))^2[\log(1 + |K(u)|)]^{1+\delta}\,du$ is finite for some $\delta > 0$, a condition that is satisfied by all particular score functions considered in this paper.
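
As a small worked instance of the Hoeffding condition (5.4), the sketch below approximates $J(K) = \int_0^1 u^{1/2}(1-u)^{1/2}\,dK(u)$ by a midpoint rule, assuming $K$ is absolutely continuous so that $dK(u) = K'(u)\,du$. For Wilcoxon scores $K(u) = u$ the integral is the Beta function value $B(3/2, 3/2) = \pi/8$, which is finite. The function name and the quadrature are illustrative choices, not taken from the paper.

```python
import math


def hoeffding_integral(dk, n_cells=200_000):
    """Midpoint-rule approximation of J(K) = int_0^1 sqrt(u (1 - u)) K'(u) du.

    `dk` is the derivative K' of an absolutely continuous score function K.
    """
    h = 1.0 / n_cells
    total = 0.0
    for j in range(n_cells):
        u = (j + 0.5) * h  # midpoint of the j-th cell
        total += math.sqrt(u * (1.0 - u)) * dk(u)
    return total * h


if __name__ == "__main__":
    # Wilcoxon scores K(u) = u, so K'(u) = 1
    print(hoeffding_integral(lambda u: 1.0), math.pi / 8)
```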

These consistency results imply that our rank-based tests $\phi^{(n)}_K$ (excluding the sign test), although unrestrictedly valid under the null hypothesis of sphericity, are consistent under most nonspherical alternatives, which include nonspherical elliptic, nonelliptical unit shape and nonunit-shape cases. For Wilcoxon scores, for instance, only the very particular densities $f$ for which $I[d_2 < d_1]$ is orthogonal to the $k(k+1)/2$ variables $(U_{1,r}U_{1,s} - \delta_{rs}/k)$, $r, s = 1, \ldots, k$ ($\delta_{rs}$ standing for the Kronecker symbol) result in an inconsistent $\phi^{(n)}_W$. This either corresponds to $\mathrm{E}[U_1U_1'] \ne I_k/k$ and the joint distribution of $d_1$, $U_1$ and $d_2$ compensating exactly for the deviations of all $U_{1,r}U_{1,s}$ from $\delta_{rs}/k$ ($r, s = 1, \ldots, k$), or to unit shape densities under which $I[d_2 < d_1]$ and $U_1U_1'$ are uncorrelated. To the best of our knowledge, the only test retaining consistency under the whole nonspherical alternative is Baringhaus' test [5]; but the price that must be paid is that the separation rates are nonparametric, which entails that its ARE with respect to $\phi^{(n)}_K$ is zero at elliptical alternatives.

The situation is slightly different with the adjusted sign test. As already mentioned, the natural null hypothesis for this test is that of unit shape and consistency is achieved at all nonunit shape alternatives, since the score [$K(u) = 1$] cannot here compensate for deviations from $I_k/k$ of $U_iU_i'$. On the other hand, the price to be paid in terms of efficiency at elliptical alternatives can be quite high: the AREs of sign tests with respect to their van der Waerden counterparts $\phi^{(n)}_{\mathrm{vdW}}$ are only 0.681, 0.500 and 0.279, respectively, at $t_5$, Gaussian and $e_3$ alternatives, in dimension $k = 2$.
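
The first two of these figures can be recovered from Table 2 by dividing the sign-test entry by the van der Waerden entry, since both are AREs with respect to the same Gaussian test. The sketch below does this arithmetic for $k = 2$ and also checks the observation that the Gaussian-column sign entries of Table 2 agree, to three decimals, with $k/(k+2)$. The dictionaries simply copy values from Table 2, and the variable and function names are hypothetical.

```python
# Shape AREs with respect to the Gaussian test, copied from Table 2 (k = 2).
SIGN_K2 = {"t5": 1.500, "normal": 0.500}
VDW_K2 = {"t5": 2.204, "normal": 1.000}

# Gaussian-column sign-test shape AREs from Table 2, indexed by dimension k.
SIGN_NORMAL = {2: 0.500, 3: 0.600, 4: 0.667, 6: 0.750, 10: 0.833}


def relative_are(are_num, are_den):
    """ARE of one test with respect to another, both measured against a common reference."""
    return are_num / are_den


def sign_gaussian_are(k):
    """The ratio k / (k + 2) that the Gaussian column of the sign rows appears to follow."""
    return k / (k + 2)


if __name__ == "__main__":
    for density in ("t5", "normal"):
        print(density, relative_are(SIGN_K2[density], VDW_K2[density]))
    for k, tabulated in SIGN_NORMAL.items():
        print(k, tabulated, sign_gaussian_are(k))
```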

As the dimension $k$ of the observation space goes to $\infty$, however, it can
easily be shown that, for fixed n, (1)^^]^^ ~ 't'^s^ ~ '^i^) "P^^^-a.s.; this justifies 

the empirical finding that the AREs of the sign test with respect to the van der Waerden test converge to 1, as $k \to \infty$, irrespective of the underlying distribution. Most interestingly, this convergence also implies that the van der Waerden test in some sense inherits, as $k \to \infty$, most of the nice validity/consistency properties of the sign test, whereas the latter, on the other hand, inherits the attractive efficiency properties of van der Waerden procedures.

6. Simulation results. The asymptotic relative efficiencies of the tests (of the null hypothesis $V = V_0$) described in Sections 3.3 and 4.2 do not depend on the null value $V_0$ of the shape matrix. Therefore, in this section, we concentrate on the particular case ($V_0 = I_k$) of testing for sphericity. We generated $N = 2{,}500$ independent samples $\varepsilon_1, \ldots, \varepsilon_{500}$ of size $n = 500$ from various bivariate spherical densities (the bivariate normal and bivariate t-distributions with 0.2, 1 and 6 degrees of freedom, resp.), with center of symmetry $\theta = (0,0)'$. From each of these samples, we constructed four series of 500 spherical (for $m = 0$) or elliptical (for $m = 1, 2, 3$) observations $X_1, \ldots, X_{500}$, characterized by

$$X_i = (I_k + m\,v)\,\varepsilon_i, \qquad m = 0, 1, 2, 3, \tag{6.1}$$

with $\mathring{\operatorname{vech}}\,v = (0, 0.14)'$.
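
The data-generating scheme (6.1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the two-component vector $(0, 0.14)'$ is read as $(v_{12}, v_{22})'$ for a symmetric $v$ with $v_{11} = 0$, the bivariate Student draws are obtained as a standard normal vector divided by $\sqrt{\chi^2_\nu/\nu}$, and all function names are hypothetical.

```python
import math
import random


def spherical_draw(rng, nu=None):
    """One bivariate spherical draw: standard normal if nu is None, else Student t_nu."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    if nu is None:
        return (z1, z2)
    # chi-square_nu / nu via a Gamma(shape nu/2, scale 2) draw; works for non-integer nu.
    # The floor at 1e-300 only guards against a division by an underflowed zero.
    w = max(rng.gammavariate(nu / 2.0, 2.0) / nu, 1e-300)
    s = math.sqrt(w)
    return (z1 / s, z2 / s)


def shape_perturbation(m, v12=0.0, v22=0.14):
    """I_2 + m v for symmetric v with v11 = 0 (assumed reading of the vector (v12, v22)')."""
    return ((1.0, m * v12), (m * v12, 1.0 + m * v22))


def transform(a, eps):
    """Apply the 2 x 2 matrix a to every point of the sample eps."""
    return [(a[0][0] * e1 + a[0][1] * e2, a[1][0] * e1 + a[1][1] * e2) for e1, e2 in eps]


def simulate(n=500, nu=None, seed=0):
    """One spherical sample and its four transforms X_i = (I + m v) eps_i, m = 0, 1, 2, 3."""
    rng = random.Random(seed)
    eps = [spherical_draw(rng, nu) for _ in range(n)]
    return {m: transform(shape_perturbation(m), eps) for m in range(4)}


if __name__ == "__main__":
    series = simulate(n=500, nu=6, seed=1)
    print({m: series[m][0] for m in series})
```

The same spherical sample is reused for all four values of $m$, as in the description above, so that the four series differ only through the shape perturbation.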

Although designed against elliptical alternatives, our tests also perform quite well under a broad class of nonelliptical alternatives. In order to show this, we considered the following skew populations. Population $\mathcal{SN}$ refers to samples of $n = 500$ observations $X_1, \ldots, X_{500}$ characterized by

$$X_i = (\operatorname{sign} V_{m,i})\,W_{m,i} - \mathrm{E}[(\operatorname{sign} V_{m,i})\,W_{m,i}], \qquad m = 0, 1, 2, 3, \tag{6.2}$$

where the i.i.d. vectors $(V_{m,i}, W_{m,i}')'$ are drawn from the trivariate normal distribution with mean $0$ and covariance matrix

$$\begin{pmatrix} 1 & \delta' \\ \delta & I_2 \end{pmatrix}, \qquad \delta = (1 + m^2 v'v)^{-1/2}\,m\,v,$$

with $v = (0.15, 0)'$. The distribution of the resulting $X_i$'s is the so-called bivariate skew normal distribution with parameters $0$, $I_2$ and $mv$ (see, e.g., [3] or [4]). Population $\mathcal{S}t_2$ is obtained in the same way, but with trivariate $t_2$-distributed vectors $(V_{m,i}, W_{m,i}')'$ with the same mean and covariance matrix as in the Gaussian case above, but $v = (0.25, 0)'$ (see [4]).
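
A minimal sketch of the Gaussian skew population (6.2) is given below. It rests on assumptions that should be read as such: the trivariate normal vector $(V, W')'$ is taken to have zero mean, unit variances and cross-covariance $\delta = (1 + m^2 v'v)^{-1/2} m v$ between $V$ and $W$, and the centering term uses the identity $\mathrm{E}[(\operatorname{sign} V)W] = \sqrt{2/\pi}\,\delta$, which follows from $\mathrm{E}[W \mid V] = \delta V$. Function names are hypothetical, and the $t_2$ variant is not covered.

```python
import math
import random


def delta_vector(m, v=(0.15, 0.0)):
    """delta = (1 + m^2 v'v)^(-1/2) m v."""
    norm2 = m * m * (v[0] ** 2 + v[1] ** 2)
    return tuple(m * c / math.sqrt(1.0 + norm2) for c in v)


def skew_normal_draw(rng, m, v=(0.15, 0.0)):
    """One centered draw X = sign(V) W - E[sign(V) W] with Cov(V, W) = delta."""
    delta = delta_vector(m, v)
    w = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    # V = delta'W + residual, so that Var(V) = 1 and Cov(V, W) = delta
    resid = math.sqrt(1.0 - delta[0] ** 2 - delta[1] ** 2)
    big_v = delta[0] * w[0] + delta[1] * w[1] + resid * rng.gauss(0.0, 1.0)
    sign = 1.0 if big_v > 0 else -1.0
    shift = math.sqrt(2.0 / math.pi)  # E|V| for V ~ N(0, 1)
    return tuple(sign * w[j] - shift * delta[j] for j in range(2))


def skew_normal_sample(n, m, seed=0):
    """n i.i.d. centered skew-normal draws for a given m."""
    rng = random.Random(seed)
    return [skew_normal_draw(rng, m) for _ in range(n)]


if __name__ == "__main__":
    sample = skew_normal_sample(500, m=3, seed=1)
    print(sample[:3])
```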

On each of these samples, we performed the following eleven tests for sphericity (all at asymptotic level $\alpha = 5\%$): John's test [based on (3.9)], the Gaussian test $\phi_{\mathcal{N}}$ [based on (3.7)], the sign, Wilcoxon and Spearman tests [based on $Q_{K_0}$, $Q_{K_1}$ and $Q_{K_2}$ in (4.6), resp.], the van der Waerden test $\phi_{\mathrm{vdW}}$ [based on (4.9)], and several $t_\nu$-score tests $\phi_{f_{1,\nu}}$ ($\nu$ = 0.2, 0.5, 1, 2 and 6) [based on (4.10)]. Rejection frequencies are reported in Table 3. The corresponding individual confidence intervals (for $N = 2{,}500$ replications) at confidence level 0.95 have half-widths 0.0044, 0.0080 and 0.0100, for frequencies of the order of 0.05 (0.95), 0.20 (0.80) and 0.50, respectively.
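
For reference, the arithmetic behind such Monte Carlo precision figures can be sketched as follows. The three quoted values coincide with the binomial standard error $\sqrt{p(1-p)/N}$ at $N = 2{,}500$; under the usual normal approximation, a 0.95 half-width would be 1.96 times that quantity. The function names below are illustrative and do not come from the paper.

```python
import math


def binomial_standard_error(p, n_rep=2500):
    """Standard error of a rejection frequency p estimated from n_rep replications."""
    return math.sqrt(p * (1.0 - p) / n_rep)


def normal_half_width(p, n_rep=2500, z=1.96):
    """Normal-approximation half-width z * SE (z = 1.96 for a 0.95 level)."""
    return z * binomial_standard_error(p, n_rep)


if __name__ == "__main__":
    for p in (0.05, 0.20, 0.50):
        print(p, binomial_standard_error(p), normal_half_width(p))
```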

Inspection of Table 3 reveals that the Gaussian test $\phi_{\mathcal{N}}$ collapses under the heavy-tailed distributions $t_{0.2}$ and $t_1$ (which have infinite fourth-order moments) and confirms the fact that John's test is only valid under normal distributions. All rank-based tests apparently satisfy the 5% probability level constraint. Power rankings are essentially consistent with the corresponding ARE values, which we also report in Table 3. In particular, the asymptotic optimality of $\phi_{f_{1,\nu}}$ under the Student distribution with $\nu$ degrees of freedom is confirmed. The performances under elliptical and nonelliptical alternatives of the various procedures seem to be quite similar.

Finally, in order to investigate the performances of our tests in very small samples, we generated $N = 2{,}500$ independent samples of size $n = 25$ based on (6.1) [but with $\mathring{\operatorname{vech}}\,v = (0, 0.2)'$]. Only Gaussian and $t_{0.2}$ densities were considered. The corresponding rejection frequencies are reported in Table 4. Similar conclusions as in the first Monte Carlo study above hold in this small sample simulation. However, note that for such a small sample size, the asymptotic approximation seems to produce strictly conservative critical values for the van der Waerden- and $t_6$-score versions of our tests.



APPENDIX

A.1. Proof of Proposition 2.1. Our proof relies on Lemma 1 from Swensen [49] (more precisely, on its extension by Garel and Hallin [12]). The sufficient conditions for LAN given in Swensen's result follow from standard arguments once it is shown that $(\theta, \sigma^2, V) \mapsto f^{1/2}_{\theta,\sigma^2,V;f_1}$ is differentiable in quadratic mean, where $f_{\theta,\sigma^2,V;f_1}$ is the density in (1.1), and we therefore focus on this. The main step in establishing this quadratic mean differentiability is the following [here and in the sequel, all $o(\|\cdot\|)$ or $O(\|\cdot\|)$ quantities are taken as $\|\cdot\| \to 0$]:



Table 3

Rejection frequencies (out of N = 2,500 replications), under various null and nonnull distributions [see (6.1) and (6.2) for details], of John's test ($\phi_{\mathrm{John}}$), the Gaussian parametric test ($\phi_{\mathcal{N}}$) and the signed-rank van der Waerden ($\phi_{\mathrm{vdW}}$), $t_\nu$-score ($\phi_{f_{1,\nu}}$, $\nu$ = 0.2, 0.5, 1, 2, 6), sign ($\phi_S$), Wilcoxon-type ($\phi_W$) and Spearman-type ($\phi_{SP}$) tests, respectively; the sample size is 500 ("ND" means "not defined," which occurs as soon as one of the two tests involved is not valid under the distribution being considered; "?" indicates that no theoretical ARE values are available under nonelliptical alternatives)

                                         m
Test             Density    0        1        2        3        ARE

John             N          0.0504   0.2380   0.6856   0.9492   1.000
N                           0.0492   0.2348   0.6824   0.9492   1.000
vdW                         0.0460   0.2208   0.6652   0.9432   1.000
f_{1,6}                     0.0468   0.2260   0.6644   0.9404   0.957
f_{1,2} (= W)               0.0544   0.2052   0.6036   0.9028   0.844
f_{1,1}                     0.0544   0.1900   0.5532   0.8600   0.741
f_{1,0.5}                   0.0560   0.1732   0.5000   0.8024   0.648
f_{1,0.2}                   0.0560   0.1628   0.4536   0.7476   0.568
S                           0.0568   0.1484   0.4016   0.6908   0.500
SP                          0.0460   0.2180   0.6576   0.9356   0.934

John             t_6        [illegible]  [illegible]  [illegible]  [illegible]  ND
N                           0.0480   0.1580   0.4528   0.7608   1.000
vdW                         0.0428   0.1816   0.5708   0.8800   1.531
f_{1,6}                     0.0460   0.1956   0.5916   0.8956   1.600
f_{1,2} (= W)               0.0520   0.1904   0.5832   0.8860   1.531
f_{1,1}                     0.0500   0.1836   0.5444   0.8588   1.408
f_{1,0.5}                   0.0464   0.1708   0.4980   0.8148   1.269
f_{1,0.2}                   0.0468   0.1480   0.4432   0.7648   1.172
S                           0.0488   0.1284   0.3884   0.7064   1.000
SP                          0.0480   0.1980   0.5956   0.8888   1.579

John             t_1        0.9868   0.9872   0.9848   0.9840   ND
N                           0.0060   0.0052   0.0064   0.0088   ND
vdW                         0.0432   0.1244   0.3620   0.6508   ND
f_{1,6}                     0.0456   0.1492   0.4256   0.7376   ND
f_{1,2} (= W)               0.0480   0.1636   0.4668   0.7936   ND
f_{1,1}                     0.0468   0.1632   0.4724   0.8028   ND
f_{1,0.5}                   0.0460   0.1636   0.4700   0.7964   ND
f_{1,0.2}                   0.0428   0.1548   0.4404   0.7644   ND
S                           0.0452   0.1408   0.4020   0.7064   ND
SP                          0.0488   0.1444   0.4092   0.7240   ND

John             t_{0.2}    0.9468   0.9460   0.9460   0.9500   ND
N                           0.0196   0.0184   0.0252   0.0352   ND
vdW                         0.0412   0.0924   0.2468   0.4644   ND
f_{1,6}                     0.0452   0.1144   0.2996   0.5572   ND
f_{1,2} (= W)               0.0528   0.1284   0.3460   0.6220   ND
f_{1,1}                     0.0544   0.1348   0.3760   0.6672   ND
f_{1,0.5}                   0.0476   0.1356   0.3908   0.6996   ND
f_{1,0.2}                   0.0500   0.1372   0.3940   0.7016   ND
S                           0.0468   0.1296   0.3724   0.6764   ND
SP                          0.0468   0.1056   0.2752   0.5100   ND

John             SN         0.0520   0.0624   0.2596   0.8000   ?
N                           0.0528   0.0664   0.2600   0.8000   ?
vdW                         0.0472   0.0608   0.2488   0.7828   ?
f_{1,6}                     0.0508   0.0620   0.2456   0.7808   ?
f_{1,2} (= W)               0.0492   0.0620   0.2304   0.7336   ?
f_{1,1}                     0.0488   0.0608   0.2012   0.6784   ?
f_{1,0.5}                   0.0476   0.0620   0.1796   0.6112   ?
f_{1,0.2}                   0.0492   0.0568   0.1568   0.5540   ?
S                           0.0512   0.0544   0.1412   0.4972   ?
SP                          0.0528   0.0652   0.2504   0.7752   ?

John             St_2       0.8640   0.8616   0.9044   0.9520   ?
N                           0.0196   0.0188   0.0640   0.1896   ?
vdW                         0.0536   0.0740   0.4144   0.8504   ?
f_{1,6}                     0.0536   0.0724   0.4184   0.8276   ?
f_{1,2} (= W)               0.0512   0.0744   0.3592   0.6964   ?
f_{1,1}                     0.0472   0.0724   0.2964   0.5048   ?
f_{1,0.5}                   0.0484   0.0720   0.2324   0.3280   ?
f_{1,0.2}                   0.0464   0.0688   0.1744   0.2076   ?
S                           0.0468   0.0604   0.1524   0.1556   ?
SP                          0.0552   0.0756   0.4592   0.8820   ?

Table 4

Rejection frequencies (still out of N = 2,500 replications) under spherical and elliptic Gaussian and $t_{0.2}$ distributions, of the same tests as in Table 3; the sample size is now 25

                                         m
Test             Density    0        1        2        3        ARE

John             N          [illegible]  [illegible]  [illegible]  [illegible]  1.000
N                           0.0424   0.5848   0.8924   0.9708   1.000
vdW                         0.0172   0.4136   0.8088   0.9408   1.000
f_{1,6}                     0.0356   0.5280   0.8684   0.9628   0.957
f_{1,2} (= W)               0.0416   0.5400   0.8612   0.9584   0.844
f_{1,1}                     0.0468   0.5036   0.8316   0.9432   0.741
f_{1,0.5}                   0.0496   0.4500   0.7924   0.9132   0.648
f_{1,0.2}                   0.0484   0.4016   0.7328   0.8724   0.568
S                           0.0480   0.3580   0.6736   0.8216   0.500
SP                          0.0396   0.5600   0.8856   0.9696   0.934

John             t_{0.2}    0.8652   0.9076   0.9360   0.9484   ND
N                           0.0004   0.0008   0.0016   0.0020   ND
vdW                         0.0148   0.1476   0.3608   0.5192   ND
f_{1,6}                     0.0308   0.2492   0.5080   0.6844   ND
f_{1,2} (= W)               0.0452   0.3288   0.6168   0.7968   ND
f_{1,1}                     0.0496   0.3592   0.6784   0.8376   ND
f_{1,0.5}                   0.0488   0.3824   0.7172   0.8584   ND
f_{1,0.2}                   0.0508   0.3892   0.7272   0.8692   ND
S                           0.0480   0.3752   0.7044   0.8504   ND
SP                          0.0348   0.2320   0.4620   0.6352   ND


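Lemma A.1. Let Assumptions (A1) and (A2) hold. Define

$$g_{\theta,\Sigma;f_1}(x) := c_{k,f_1}|\Sigma|^{-1/2} f_1(\|x - \theta\|_\Sigma), \qquad x \in \mathbb{R}^k,$$

and

$$\times \operatorname{vec}\big(\varphi_{f_1}(\|x - \theta\|_\Sigma)\,\|x - \theta\|_\Sigma\, u(\theta,\Sigma)u'(\theta,\Sigma) - I_k\big),$$

where $\|z\|_\Sigma := (z'\Sigma^{-1}z)^{1/2}$, $u(\theta,\Sigma) := \Sigma^{-1/2}(x - \theta)/\|x - \theta\|_\Sigma$ and $P_k$ is such that $P_k'(\operatorname{vech} H) = \operatorname{vec} H$ for any symmetric $k \times k$ matrix $H = (H_{ij})$. Then: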



= o(||H||2) and 




To prove Lemma A.1, we need the following reformulation of Assumption (A2):

Lemma A.2. Assumption (A2) holds iff (i) $f^{1/2}_{1;\exp} \in L^2(\mathbb{R}, \nu_k)$ and (ii) there exists $Df^{1/2}_{1;\exp} \in L^2(\mathbb{R}, \nu_k)$ such that

$$\int \big[f^{1/2}_{1;\exp}(x + h) - f^{1/2}_{1;\exp}(x) - h\,(Df^{1/2}_{1;\exp})(x)\big]^2 e^{kx}\,dx = o(h^2)$$

as $h \to 0$. In that case, $Df^{1/2}_{1;\exp}$ and $(f^{1/2}_{1;\exp})'$ are equal in $L^2(\mathbb{R}, \nu_k)$.

The proof of this lemma relies on the following result by Schwartz (see [47], pages 186-188):

Lemma A.3 (Schwartz). The real function $g$ is in $W^{1,2}(\mathbb{R})$ (with weak derivative $g'$, say) iff (i) $g \in L^2(\mathbb{R})$ and (ii) there exists $Dg \in L^2(\mathbb{R})$ such that $x \mapsto g(x + h) - g(x) - h(Dg(x))$ is $o(h)$ in $L^2(\mathbb{R})$ (as $h \to 0$), that is, $\int [g(x + h) - g(x) - h(Dg(x))]^2\,dx = o(h^2)$ as $h \to 0$. In that case, $Dg$ and $g'$ are equal in $L^2(\mathbb{R})$.

Proof of Lemma A.2. Throughout this proof, we write $f$ instead of $f^{1/2}_{1;\exp}$ and $o(h)$'s are taken as $h \to 0$.

(Necessity) It is easy to show that the real function $x \mapsto g(x) := f(x)e^{kx/2}$ admits the weak derivative $x \mapsto g'(x) = f'(x)e^{kx/2} + (k/2)g(x)$, where $f'$ denotes the weak derivative of $f$. In view of the assumptions on $f$, both $g$ and $g'$ are in $L^2(\mathbb{R})$. Lemma A.3 therefore yields that $x \mapsto M_h(x) := g(x + h) - g(x) - hg'(x)$ is $o(h)$ in $L^2(\mathbb{R})$. But $M_h = I_h + J_h + K_h + L_h$, where

$$I_h(x) := (f(x + h) - f(x) - hf'(x))\,e^{kx/2},$$

$$J_h(x) := f(x + h)\,e^{k(x+h)/2}\,e^{-kh/2}\,(e^{kh/2} - 1 - kh/2),$$

$$K_h(x) := (f(x + h)\,e^{k(x+h)/2} - f(x)\,e^{kx/2})\,hk/2$$

and

$$L_h(x) := f(x + h)\,e^{k(x+h)/2}\,(e^{-kh/2} - 1)\,hk/2.$$

Since $J_h$, $K_h$ and $L_h$ are also $o(h)$ in $L^2(\mathbb{R})$, so is $I_h$.

(Sufficiency) Assume now that $f \in L^2(\mathbb{R}, \nu_k)$ is such that $x \mapsto I_h(x) := (f(x + h) - f(x) - hDf(x))\,e^{kx/2}$ is $o(h)$ in $L^2(\mathbb{R})$ for some $Df \in L^2(\mathbb{R}, \nu_k)$ and again define $x \mapsto g(x) := f(x)e^{kx/2}$ [$g \in L^2(\mathbb{R})$]. With $Dg(x) := Df(x)e^{kx/2} + (k/2)g(x)$ [$Dg \in L^2(\mathbb{R})$], we have that

$$x \mapsto M_h(x) := g(x + h) - g(x) - hDg(x) = (f(x + h) - f(x) - hDf(x))\,e^{kx/2} + J_h(x) + K_h(x) + L_h(x)$$

is $o(h)$ in $L^2(\mathbb{R})$. Lemma A.3 thus yields that $Dg$ is the weak derivative of $g$; this implies that, for all infinitely differentiable compactly supported functions $\varphi$,

$$\int [\varphi(x)e^{-kx/2}]\,[Df(x)e^{kx/2} + (k/2)g(x)]\,dx = -\int [\varphi'(x)e^{-kx/2} - (k/2)\varphi(x)e^{-kx/2}]\,[f(x)e^{kx/2}]\,dx,$$

that is, that $Df$ is the weak derivative of $f$. □
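
The algebraic decomposition $M_h = I_h + J_h + K_h + L_h$ used in the proof of Lemma A.2 can be checked numerically. The sketch below does so pointwise for one illustrative choice of $f$ and $k$; the Gaussian-shaped $f$, the value $k = 3$ and all function names are assumptions made for the example only.

```python
import math

K_DIM = 3  # illustrative value of the dimension k (an assumption for this example)


def f(x):
    """Illustrative smooth, rapidly decaying function (not from the paper)."""
    return math.exp(-x * x)


def df(x):
    """Derivative of the illustrative f."""
    return -2.0 * x * math.exp(-x * x)


def g(x):
    """g(x) = f(x) e^{kx/2}."""
    return f(x) * math.exp(K_DIM * x / 2.0)


def dg(x):
    """g'(x) = f'(x) e^{kx/2} + (k/2) g(x)."""
    return df(x) * math.exp(K_DIM * x / 2.0) + (K_DIM / 2.0) * g(x)


def m_h(x, h):
    return g(x + h) - g(x) - h * dg(x)


def i_h(x, h):
    return (f(x + h) - f(x) - h * df(x)) * math.exp(K_DIM * x / 2.0)


def j_h(x, h):
    c = K_DIM * h / 2.0
    return g(x + h) * math.exp(-c) * (math.exp(c) - 1.0 - c)


def k_h(x, h):
    return (g(x + h) - g(x)) * h * K_DIM / 2.0


def l_h(x, h):
    return g(x + h) * (math.exp(-K_DIM * h / 2.0) - 1.0) * h * K_DIM / 2.0


def decomposition_gap(x, h):
    """M_h(x) - (I_h + J_h + K_h + L_h)(x); zero up to rounding if the identity holds."""
    return m_h(x, h) - (i_h(x, h) + j_h(x, h) + k_h(x, h) + l_h(x, h))


if __name__ == "__main__":
    print(max(abs(decomposition_gap(x, h)) for x in (-2.0, 0.0, 1.5) for h in (0.5, 1e-3)))
```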
Proof of Lemma A.1. (i) See [17].

(ii) Using the fact that $(C' \otimes A)\operatorname{vec} B = \operatorname{vec}(ABC)$ and letting $y := \Sigma^{-1/2}(x - \theta)$, the left-hand side of (ii) takes the form

ckj, j {p;^^^/i^'(llylli^+Hj - /i/'(||y||) - \fl'\\\y\\) 

X (vecHsyvec(^V/i(l|y||)pj -Ife)}'t^y 

<C(Ti + T2 + T3), 
where Hs := 5]^^'''^H5]^^/2^ (j gome positive constant, 

/{ + Hsl^/'^ ^(vecHx;)'(vecIfc)| /i(||y||i;,+Hs) c^y, 

T2:= j ^[(vecHs)'(vecIfc)]2{A^/^||y||i,+H.)-/y'(||y||)}'rfy 
and

^3:= /{/y'(l|ylk,+H,)-/i/'(||y||) 

-^/y'(l|y||)(vecHs)'vec(^/,(||y||)^)py. 

Since $(\operatorname{vec} A)'(\operatorname{vec} B) = \operatorname{tr}(A'B)$ and $|A + B|^a = |A|^a + a|A|^a\operatorname{tr}(A^{-1}B) + o(\|B\|)$ for all $a$ (see, e.g., [31], page 149),

Ti = + H^l-V^ - 1 + ^(trHs) j' = o(||Hf ). 

Now, working in spherical coordinates (r, u) := (||y ||, y/||y||), we obtain 
= C// {/y'(r||u||i,+H,) - fl'\r) 

- \fi'^{r)i)f^{r)r[un^VL]fr^-^drda{xi) 
= C jj {/i/exp((lnr) + (In ||u||i,+h,)) - fl'l^{\nr) 

+ {fl'l^)'{\nr)[\v/Yi^u]fr'~Urda{n) 
= c||{AY4(. + (ln||u||i,+H,)) 

- /i;{xp(^) + (/i'fip)'(s)[iu'Hsu]}2e'=^dsda(u) 

where 

73a -=11 {/lYexp(« + (In ||u||l,+Hs)) 

-/iV4(^)-(/i;{xp)'(s)[ln||u||i,+H,]}'e''^d.da(u) 

and 

T,,:= ||{[ln||u||i,+Hj + [^uHsu]}^[(/iY4)'(sre'=^^i.da(u). 

By using Lemma A.2 and the fact that $\ln\|u\|_{I_k + H_\Sigma} = O(\|H\|)$ for all $u$, we obtain that

/ {/i/exp(« + (In l|u||i,+H J) - fl'l^is) - (/i£p)'(s)[ln ||u||i,+H J}'e^- ds 
for all $u$. Therefore, from Lebesgue's dominated convergence theorem, it follows that $T_{3a} = o(\|H\|^2)$. As for $T_{3b}$, we have that

Tsb< sup {[ln||u||i,+Hs] + [iu'Hsu]f = o(||Hf) 

since $[\ln\|u\|_{I_k + H_\Sigma}] + [\tfrac{1}{2}u'H_\Sigma u] = o(\|H\|)$, uniformly for $u \in \mathcal{S}^{k-1}$ (see, e.g., [31], page 151). Consequently, $T_3 = o(\|H\|^2)$, so $T_3 = o(1)$ as $\|H\|$ goes to zero, and hence

T2 < C||Hs||^|{/y^||y||i,+Hj - fl^\\\y\\}fdy 

<C|lHs||2/|^/i/'(||y|l)(vecHsyvec(v/,(l|y||)^)}'dy + o(||H||2), 

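which shows that $T_2 = o(\|H\|^2)$. This proves (ii).

(iii) The left-hand side in (iii) is bounded by $C(S_1 + S_2 + \|\operatorname{vech} H\|^2 S_3)$, where

$$S_1 := \int \{g^{1/2}_{\theta+t,\Sigma;f_1}(x) - g^{1/2}_{\theta,\Sigma;f_1}(x) - t'(D_\theta g^{1/2}_{\theta,\Sigma;f_1}(x))\}^2\,dx,$$

$$S_2 := \int \{g^{1/2}_{\theta,\Sigma+H;f_1}(x) - g^{1/2}_{\theta,\Sigma;f_1}(x) - (\operatorname{vech} H)'(D_\Sigma g^{1/2}_{\theta,\Sigma;f_1}(x))\}^2\,dx$$

and

$$S_3 := \int \|D_\Sigma g^{1/2}_{\theta+t,\Sigma;f_1}(x) - D_\Sigma g^{1/2}_{\theta,\Sigma;f_1}(x)\|^2\,dx = \int \|D_\Sigma g^{1/2}_{\theta,\Sigma;f_1}(x - t) - D_\Sigma g^{1/2}_{\theta,\Sigma;f_1}(x)\|^2\,dx.$$

Now, from (i) and (ii), respectively, $S_1$ and $S_2$ are $o(\|(t', (\operatorname{vech} H)')'\|^2)$. As for $S_3$, the quadratic mean continuity of $x \mapsto D_\Sigma g^{1/2}_{\theta,\Sigma;f_1}(x) \in L^2(\mathbb{R}^k)$ implies that it is $o(1)$ as $t \to 0$. The result follows. □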

Lemma A.4. Let $x \mapsto G_\eta(x)$ be differentiable in quadratic mean at $\eta_0$, with gradient $x \mapsto DG_{\eta_0}(x)$, say. Let $h$ be a diffeomorphism in a neighborhood of $\xi_0 := h^{-1}(\eta_0)$. Then $x \mapsto G_{h(\xi)}(x)$ is differentiable in quadratic mean at $\xi_0$, with gradient $x \mapsto (Dh_{\xi_0})'(DG_{h(\xi_0)}(x))$, where $Dh_{\xi_0} := (\frac{\partial h_i}{\partial \xi_j}(\xi_0))$ denotes the Jacobian matrix of $h$ at $\xi_0$.

Proof of Lemma A.4. This is straightforward. □

Applied to Lemma A.1(iii), the latter result implies that $x \mapsto f^{1/2}_{\vartheta;f_1}(x) = f^{1/2}_{\theta,\sigma^2,V;f_1}(x) = g^{1/2}_{\theta,\sigma^2V;f_1}(x)$ is differentiable in quadratic mean, with gradient



where 
W^.jM) := 



'1 (vechVy\ 1/2 / N I = oZijJiW^'?;/!^' 

|V-V2u(6l,V) 



1 / fT-2(vecK)' 



V 



|x-0| 



2 VMfc(V®2)-i/2 

/l|x-0||v 
X vec I -(/^/^ I 



(T 



-U(0,V)U'(0,V)-I; 



Checking Swensen's sufficient conditions for LAN is then a routine task. For example, letting $v_i^{(n)} := (f^{1/2}_{\vartheta + n^{-1/2}\tau^{(n)};f_1}(X_i)/f^{1/2}_{\vartheta;f_1}(X_i)) - 1$ and $z_i^{(n)} := (1/2)(\tau^{(n)})'n^{-1/2}W_{\vartheta;f_1}(X_i)$, $i = 1, \ldots, n$, we have



E 



An)^2 



ly- ' - Z. 

/ 'f/l?+n-l/2T-(");/i(^) 

- W - (l/2)(TW)'n-V2/Vj^(x)T^^^^^(x)}2dx 

= {/'i+.-l/2^(n)^;,(x) - /'4(X) - (n-V2^(n)y(^^l/2^(^))}2^^^ 

which is $o(1)$ as $n \to \infty$. The other conditions easily follow. Now, the linear term in the second-order decomposition of the local log-likelihood ratio is $2\sum_{i=1}^n z_i^{(n)} = (\tau^{(n)})'\Delta^{(n)}_{f_1}(\vartheta)$, where $\Delta^{(n)}_{f_1}(\vartheta)$ is the central sequence given in (2.5).

A.2. Proofs of Lemma 3.1 and Proposition 3.1.

Proof of Lemma 3.1. Denote by $Q_k(V)$ the matrix in the right-hand side of (3.3). Tedious but routine algebra yields

(where $N_k$ is defined in Section 1.4). In order to prove the lemma, it is therefore sufficient to show that $M_k'N_kQ_k(V) = Q_k(V)$. Now, it is easily seen that

$$Q_k(V) = [I_{k^2} - (\operatorname{vec} V)(e_{k^2,1})']\,[I_{k^2} + K_k]\,(V^{\otimes 2})\,[I_{k^2} - (\operatorname{vec} V)(e_{k^2,1})']'.$$

But, letting $E_{ij} := e_ie_j' + e_je_i'$ [where $(e_1, \ldots, e_k)$ stands for the canonical basis of $\mathbb{R}^k$], we have

$$[I_{k^2} - (\operatorname{vec} V)(e_{k^2,1})']\,[I_{k^2} + K_k] = I_{k^2} + K_k - 2(\operatorname{vec} V)(e_{k^2,1})' = \frac{1}{2}\sum_{(i,j) \ne (1,1)} (\operatorname{vec} E_{ij})(\operatorname{vec} E_{ij})' + 2(\operatorname{vec}(e_1e_1' - V))(e_{k^2,1})'.$$

The result follows, since $M_k'N_k(\operatorname{vec} W) = (\operatorname{vec} W)$ for any symmetric $k \times k$ matrix $W = (W_{ij})$ such that $W_{11} = 0$ [recall that it is assumed that $V = (V_{ij})$ is symmetric with $V_{11} = 1$].
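
The matrix identities used in this proof lend themselves to an exact check. The sketch below verifies, in rational arithmetic and for a user-supplied symmetric $V$, that $I_{k^2} + K_k = \frac12\sum_{i,j}(\operatorname{vec}E_{ij})(\operatorname{vec}E_{ij})'$, that $[I_{k^2} - (\operatorname{vec}V)e_{k^2,1}'][I_{k^2} + K_k] = I_{k^2} + K_k - 2(\operatorname{vec}V)e_{k^2,1}'$, and that the latter equals the sum over $(i,j) \ne (1,1)$ plus $2\operatorname{vec}(e_1e_1' - V)e_{k^2,1}'$. It assumes column-stacked vec and the commutation matrix for $K_k$; the helper names are hypothetical.

```python
from fractions import Fraction


def idx(i, j, k):
    """Position of entry (i, j) in the column-stacked vec of a k x k matrix."""
    return j * k + i


def vec(mat, k):
    return [mat[p % k][p // k] for p in range(k * k)]


def outer(a, b):
    return [[x * y for y in b] for x in a]


def add(a, b, cb=1):
    """Entrywise a + cb * b."""
    return [[x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def commutation(k):
    """K_k such that K_k vec(A) = vec(A')."""
    n = k * k
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(k):
        for j in range(k):
            m[idx(i, j, k)][idx(j, i, k)] = Fraction(1)
    return m


def e_sym(i, j, k):
    """E_ij = e_i e_j' + e_j e_i' as a k x k matrix."""
    m = [[Fraction(0)] * k for _ in range(k)]
    m[i][j] += 1
    m[j][i] += 1
    return m


def check_identities(v):
    """Return three booleans, one per identity described in the lead-in."""
    k = len(v)
    n = k * k
    i_plus_k = add(eye(n), commutation(k))
    e1 = [Fraction(int(p == 0)) for p in range(n)]  # first canonical vector of R^{k^2}
    vec_v = vec(v, k)
    full = [[Fraction(0)] * n for _ in range(n)]
    partial = [[Fraction(0)] * n for _ in range(n)]
    for i in range(k):
        for j in range(k):
            t = vec(e_sym(i, j, k), k)
            full = add(full, outer(t, t), Fraction(1, 2))
            if (i, j) != (0, 0):
                partial = add(partial, outer(t, t), Fraction(1, 2))
    lhs = matmul(add(eye(n), outer(vec_v, e1), -1), i_plus_k)
    mid = add(i_plus_k, outer(vec_v, e1), -2)
    e1e1 = [[Fraction(int(a == 0 and b == 0)) for b in range(k)] for a in range(k)]
    diff = vec(add(e1e1, v, -1), k)
    rhs = add(partial, outer(diff, e1), 2)
    return full == i_plus_k, lhs == mid, mid == rhs


if __name__ == "__main__":
    v = [[Fraction(1), Fraction(1, 2)], [Fraction(1, 2), Fraction(2)]]
    print(check_identities(v))
```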

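Proof of Proposition 3.1. Under $\mathrm{P}^{(n)}_{\vartheta_0;f_1}$, for any fixed $\vartheta_0' := (\theta', \sigma^2, (\mathring{\operatorname{vech}}\,V_0)')$, we have

$$Q^{(n)}_{f_1} = (\Delta^{*(n)}_{f_1}(\vartheta_0))'(\Gamma^*_{f_1}(\vartheta_0))^{-1}\Delta^{*(n)}_{f_1}(\vartheta_0) + o_{\mathrm P}(1)$$

as $n \to \infty$. The proof of the first statement in part (i) of Proposition 3.1 follows, since $\Delta^{*(n)}_{f_1}(\vartheta_0)$ is asymptotically $\mathcal{N}_{k(k+1)/2-1}(0, \Gamma^*_{f_1}(\vartheta_0))$ under $\mathrm{P}^{(n)}_{\vartheta_0;f_1}$. On the other hand, it is easy to see, still under $\mathrm{P}^{(n)}_{\vartheta_0;f_1}$, that $\Delta^{*(n)}_{f_1}(\vartheta_0)$ and the local log-likelihood ratio $\Lambda^{(n)}_{\vartheta_0 + n^{-1/2}\tau/\vartheta_0;f_1}$, where $\tau' := (t', s, (\mathring{\operatorname{vech}}\,v)')$, are jointly multinormal, with asymptotic covariance $(\Gamma^*_{f_1}(\vartheta_0))(\mathring{\operatorname{vech}}\,v)$. Le Cam's third lemma thus implies that $\Delta^{*(n)}_{f_1}(\vartheta_0)$ is asymptotically

$$\mathcal{N}_{k(k+1)/2-1}\big((\Gamma^*_{f_1}(\vartheta_0))(\mathring{\operatorname{vech}}\,v),\ \Gamma^*_{f_1}(\vartheta_0)\big)$$

under $\mathrm{P}^{(n)}_{\vartheta_0 + n^{-1/2}\tau;f_1}$, which establishes the second statement in part (i) of the proposition.

As for part (ii), the fact that $\phi^{(n)}_{f_1}$ has asymptotic level $\alpha$ follows directly from the asymptotic null distribution given in part (i) and the classical Helly-Bray theorem, while local asymptotic maximinity is a consequence of the weak convergence to Gaussian shifts of local shape experiments (see, e.g., Section 11.9 of [29]). □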

A.3. Proofs of Propositions 3.2, 4.1 and Lemma 4.1. 

Proof of Proposition 3.2. Under P^^^^.^^, for any fixed i^'q := {0',a'^, 
(vec h Vq ) ' ) , we have 

Qin) ^ (A;W(^o))'(aii?.(<7i)T,-i(Vo))-^A;(;)(^o) +op(l) 
as $n \to \infty$, where $\Upsilon_k^{-1}(V_0)$ was defined in (3.2). The result then follows, as in Proposition 3.1, by proving that, under $\mathrm{P}^{(n)}_{\vartheta_0 + n^{-1/2}\tau;g_1}$ [with $\tau' := (t', s, (\mathring{\operatorname{vech}}\,v)')$], we have

^A/-(afcE[^,,(Gr^(t.))(Gr'W)']Tfc-^Vo)(vechv),4i^fc(50T,THVo)) 

[also note that integration by parts yields E['ipg-^{G^^ {u)){Gi^ (u))^] = {k + 
2)Dk{gi)]. As for the optimality statement in part (ii) of the proposition, it is 
obtained as in the proof of Proposition 3.1 and by noting that alEk{(f>i)T'^^ (Vq) 

ni(^o). □ 
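The integration-by-parts identity quoted in this proof, which equates the expectation of $\varphi_{g_1}(d)\,d^3$ with $(k+2)$ times the second moment of the radial distance $d$, can be checked by numerical quadrature. The sketch below assumes, purely for illustration, a Student-type radial function $g(r) = (1 + r^2/\nu)^{-(k+\nu)/2}$ with $\nu > 2$, for which $\varphi_g(r) = -g'(r)/g(r) = (k+\nu)r/(\nu + r^2)$; the function names and the grid are hypothetical choices, and normalising constants cancel in the ratio.

```python
import numpy as np

def trapezoid(y, x):
    # Plain trapezoidal rule on a possibly non-uniform grid.
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def ibp_ratio(k=3, nu=8.0):
    # Ratio E[phi_g(d) d^3] / E[d^2] when d has density proportional
    # to r^(k-1) g(r); integration by parts predicts the value k + 2.
    r = np.geomspace(1e-6, 1e4, 200_001)
    g = (1.0 + r ** 2 / nu) ** (-(k + nu) / 2.0)   # unnormalised radial function
    phi = (k + nu) * r / (nu + r ** 2)             # phi_g = -g'/g
    radial = r ** (k - 1) * g                      # unnormalised density of d
    numerator = trapezoid(phi * r ** 3 * radial, r)
    second_moment = trapezoid(r ** 2 * radial, r)
    return numerator / second_moment

print(ibp_ratio(k=3, nu=8.0))
```

The ratio is invariant under rescaling of $d$, so the scale normalisation of the radial density plays no role in this check.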

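Proof of Lemma 4.1. Let



$$\mathbf{T}^{(n)}_{K} := n^{-1/2}\sum_{i=1}^{n} K\Big(\frac{R_i}{n+1}\Big)\operatorname{vec}\Big(\mathbf{U}_i\mathbf{U}_i' - \frac{1}{k}\mathbf{I}_k\Big)$$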



and 



a 



Clearly, it is sufficient to prove that $\mathbf{T}^{(n)}_{K} - \mathbf{T}^{(n)}_{K;g_1}$ goes to zero in quadratic
mean, under $\mathrm{P}^{(n)}_{\vartheta;g_1}$, as $n \to \infty$. For all $\ell = 1, 2, \ldots, k^2$, we have



where, denoting by $U_{i,j}$ the $j$th component of $\mathbf{U}_i$, $C_{\ell,k} = \operatorname{Var}[U_{1,1}^2] = 2(k-1)/(k^2(k+2))$
for $\ell \in \mathcal{L}_k := \{mk + m + 1,\ m = 0, 1, \ldots, k-1\}$ and $C_{\ell,k} = \operatorname{Var}[U_{1,1}U_{1,2}] = 1/(k(k+2))$
for $\ell \notin \mathcal{L}_k$. Hájek's classical projection result for linear
signed rank statistics ([15]; see also [44], Chapter 3) thus yields the desired
result. □
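The two variance constants and the index set in this proof lend themselves to a quick check. The sketch below relies on the standard moments of a vector uniformly distributed on the unit sphere of $\mathbb{R}^k$, namely $\mathrm{E}[U_1^2] = 1/k$, $\mathrm{E}[U_1^4] = 3/(k(k+2))$ and $\mathrm{E}[U_1^2U_2^2] = 1/(k(k+2))$; the function names are hypothetical. It also confirms that the positions $mk + m + 1$ are the (1-based) positions of the diagonal entries in the vec of a $k \times k$ matrix.

```python
from fractions import Fraction
import numpy as np

def diagonal_positions(k):
    # 1-based positions mk + m + 1, m = 0, ..., k-1.
    return [m * k + m + 1 for m in range(k)]

def exact_constants(k):
    # Variances of U_1^2 and of U_1 U_2 for U uniform on the unit sphere.
    var_diag = Fraction(3, k * (k + 2)) - Fraction(1, k * k)
    var_off = Fraction(1, k * (k + 2))   # E[U_1 U_2] = 0 by symmetry
    return var_diag, var_off

def simulated_constants(k, n=200_000, seed=0):
    # Monte Carlo counterpart: normalised Gaussian vectors are uniform on the sphere.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, k))
    u = z / np.linalg.norm(z, axis=1, keepdims=True)
    return float(np.var(u[:, 0] ** 2)), float(np.var(u[:, 0] * u[:, 1]))

k = 3
print(diagonal_positions(k))
print(exact_constants(k), Fraction(2 * (k - 1), k * k * (k + 2)))
print(simulated_constants(k))
```

The exact diagonal constant agrees with $2(k-1)/(k^2(k+2))$ for every $k$, and the simulated values should match both constants up to Monte Carlo error.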

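Proof of Proposition 4.1. From Lemma 4.1, we easily obtain [for
any fixed value $\vartheta_0 := (\theta', \sigma^2, (\overset{\circ}{\operatorname{vech}}\,\mathbf{V}_0)')'$ of the parameter]

$$Q^{(n)}_{K} = (\Delta^{(n)}_{K;g_1}(\vartheta_0))'\,\big(\mathrm{E}[K^2(U)]\,\Upsilon_k^{-1}(\mathbf{V}_0)\big)^{-1}\,\Delta^{(n)}_{K;g_1}(\vartheta_0) + o_{\mathrm{P}}(1)$$

as $n \to \infty$, under $\bigcup_{\sigma^2}\bigcup_{g_1}\{\mathrm{P}^{(n)}_{\theta,\sigma^2,\mathbf{V}_0;g_1}\}$. Part (i) of Proposition 4.1 follows,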
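since

as $n \to \infty$, under $\bigcup_{\sigma^2}\bigcup_{g_1}\{\mathrm{P}^{(n)}_{\theta,\sigma^2,\mathbf{V}_0;g_1}\}$, where $\tau' := (\mathbf{t}', s, (\overset{\circ}{\operatorname{vech}}\,\mathbf{v})')$. Again,
part (ii) follows, as in the proof of Proposition 3.1, by noting that the
asymptotic variance of $\Delta^{(n)}_{K_{f_1};f_1}(\vartheta_0) = \Delta^{\star(n)}_{f_1}(\vartheta_0)$ under $\bigcup_{\sigma^2}\{\mathrm{P}^{(n)}_{\theta,\sigma^2,\mathbf{V}_0;f_1}\}$
is $J_k(f_1)\,\Upsilon_k^{-1}(\mathbf{V}_0) = \Gamma^{\star}_{f_1}(\vartheta_0)$. □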

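A.4. Proof of Proposition 5.1. (i) Letting

$$\mathbf{T}^{(n)}_{K} := n^{-1/2}\sum_{i=1}^{n} K\Big(\frac{R_i}{n+1}\Big)\operatorname{vec}\Big(\mathbf{U}_i\mathbf{U}_i' - \frac{1}{k}\mathbf{I}_k\Big),$$

the necessary and sufficient consistency condition (5.2) holds iff $\mathrm{E}[\mathrm{P}[\|\mathbf{T}^{(n)}_{K}\| > t \mid \mathbf{U}^{(n)}]] \to 1$
under $\mathcal{K}^{(n)}(f)$ as $n \to \infty$, for any $t \in \mathbb{R}$. Since $\mathrm{P}[\|\mathbf{T}^{(n)}_{K}\| > t \mid \mathbf{U}^{(n)}]$
is a strictly bounded random variable, this is equivalent to

(A.1) $\mathrm{P}[\|\mathbf{T}^{(n)}_{K}\| > t \mid \mathbf{U}^{(n)}] = 1 + o_{\mathrm{P}}(1)$, under $\mathcal{K}^{(n)}(f)$, as $n \to \infty$.

Now, conditional on $\mathbf{U}^{(n)}$, each component $T^{(n)}_{K;\ell}$ of $\mathbf{T}^{(n)}_{K}$ is a linear rank
statistic with approximate scores $K(i/(n+1))$. Under the assumptions made,
the Hájek variance inequality (Theorem 3.1 in [14]) applies (conditional on
$\mathbf{U}^{(n)}$), yielding, for all $\ell$ [with appropriate $r$ and $s$, $m^{(n)}_{K} := \frac{1}{n}\sum_{i=1}^{n} K(\frac{i}{n+1})$
and $\sigma_K^2 := \int_0^1 K^2(u)\,du - (\int_0^1 K(u)\,du)^2$],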

i=l 



Var(rX'£|U("0 < 21 max - -5rs - E " "^K 

~ ' i<i<n\ K J \n+lj 

(A.2) 



since $\max_{1\le i\le n} |U_{i,r}U_{i,s} - \frac{1}{k}\delta_{rs}| \le 1$ and



i=i ^ ' ^ j=l 



The bound (A.2) on the conditional variance being uniform, it follows that
$\mathbf{T}^{(n)}_{K} = \mu^{(n)}(\mathbf{U}^{(n)}) + O_{\mathrm{P}}(1)$, with

$$\mu^{(n)}(\mathbf{U}^{(n)}) := \mathrm{E}[\mathbf{T}^{(n)}_{K} \mid \mathbf{U}^{(n)}].$$

Consequently, the necessary and sufficient condition (A.1) takes the form
$\|\mu^{(n)}(\mathbf{U}^{(n)})\| \to \infty$ [under $\mathcal{K}^{(n)}(f)$, as $n \to \infty$], which concludes the proof of (i).

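(ii) Returning to $\mu^{(n)}(\mathbf{U}^{(n)})$ and denoting by $F^{(n)}_{\mathbf{U}_i}$ the distribution function,
under $\mathcal{K}^{(n)}(f)$, of $d_i = d_i(\mathbf{0}, \mathbf{I}_k)$ conditional on $\mathbf{U}_i = \mathbf{U}_i(\mathbf{0}, \mathbf{I}_k)$, we have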



n 



i=l 



i=l 



K 



n+1 



vec(U,U^--Ifc 





\k( ^ ) 


u(") 




\n+lj 





xvec(U,U^-il 



k 



" /-oo / " \ 



X vecf UiU- - -Ifc 



say. 



Clearly, 



As for $\mathbf{E}_1^{(n)}$, Proposition 2 in [22] implies that for each component $E_{1;\ell}^{(n)}$ of



A(^-EP[dj <di|di,Ui,U,]j U(") vec(^UiU',--] 



(n) 



$\mathbf{E}_1^{(n)}$ and appropriate $r$ and $s$

n 



E 


\k( ^ ) 











k 



E 


\k( ) 








V \n + lj 







< n 
with $J(K) < \infty$ defined in (5.4) and $C < 8$ (cf. page 359 of [44]). Hence,
$\mu^{(n)}(\mathbf{U}^{(n)}) = \mathbf{E}_2^{(n)} + O_{\mathrm{P}}(1)$ and $\|\mu^{(n)}(\mathbf{U}^{(n)})\| \to \infty$ iff $\|\mathbf{E}_2^{(n)}\| \to \infty$; (5.5) follows.

(iii) For each $\ell$, convexity of $K$ implies



E 



1 " 



Similarly, Jensen's inequality implies that 



n 



n r / , n \ 

i=l L \ j=l / 
1 " / ,1 

^ - E vecf U,„U^ - -Ifc 



-^/'^f(^Evec(u„U:„-il 



m=l 

n n 



^ ^ ^ E [P [dj < I d,, U„ U,- ] I U„ U, 



j=i j=i 



xvec(u,U^--Ifc 



It follows that $E_{2;\ell}^{(n)}$ is bounded from above by

1/2. 



n 



'(^^ E EE[i^(P[^^,<di|^^*,U„U,])|U„U,]vecfu,U:-il, 



-E 



vec(UiU'i-ilfc^^ 



k( (e vec(UiU;--Ifc 



X ;^7;^ E EE[/[rfj<rf.]|U.,U,]vec('u.U:-ilfc 



+ op(l), 
1



n(n — 1) 



J2 J2^[KiP[dj < di|di,Ui,U,])|Ui,U,]vec(UiU^ - -I 



l<i^ j<n 



and 



n(n 



are U-statistics with finite-variance kernels 

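$$(\mathbf{u}, \mathbf{v}) \mapsto \mathrm{E}\big[K\big(\mathrm{P}[d_2 \le d_1 \mid d_1, \mathbf{U}_1 = \mathbf{u}, \mathbf{U}_2 = \mathbf{v}]\big) \mid \mathbf{U}_1 = \mathbf{u}, \mathbf{U}_2 = \mathbf{v}\big] \times \operatorname{vec}\Big(\mathbf{u}\mathbf{u}' - \frac{1}{k}\mathbf{I}_k\Big)$$

and

$$(\mathbf{u}, \mathbf{v}) \mapsto \mathrm{E}\big[I[d_2 \le d_1] \mid \mathbf{U}_1 = \mathbf{u}, \mathbf{U}_2 = \mathbf{v}\big]\,\operatorname{vec}\Big(\mathbf{u}\mathbf{u}' - \frac{1}{k}\mathbf{I}_k\Big),$$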

respectively. The continuous mapping theorem and standard asymptotic nor- 
mality results for U-statistics (see, e.g., [21]) imply that 



vec(UiU;-ilfc 



(A.3) X 



f E[K(P[rf 2 < di|di,Ui,U2])vec(UiU; - {l/k)Ik), 



E[vec(UiU; -(l/fc)I 



j^[ W[d2 < vec(UiU; - (l/fc)Ifc) 
I E[vec(UiU; - (l/fc)Ifc)7] 



+ Op(l). 



A sufficient condition for (5.5) to hold is thus that the quantity in braces 
in this upper bound be strictly negative, yielding part (5.6) of the claim. 
Similar arguments imply that 



e1"] > ni/2E 



vec(UiU;-ilfc'^ 



(A.4) X <^ K 



, /E[/[d2 <di]vec(UiU; - {l/k)Ik)-, 



V E[vec(UiU; - (l/A:)Ife)+] 
E[/^(P[d2 <di|di,Ui,U2])vec(UiUl - (lA)Ifc)7] 



E[vec(UiU; - (l/fc)Ifc)+ 
$+\ O_{\mathrm{P}}(1)$,

yielding part (5.7) of the claim.

(iv) For Wilcoxon scores, the upper bound (A.3) and the lower bound (A.4)
both reduce to



n 



1/2 E 



I[d2<di]vec(Vi\J[-^Ik 



-E 



/[d2<di]vec(UiU;--Ifc 



/[d2 <di]vec(UiU; 



k 



+ Op(l) 
+ Op(l). 



Part (iv) of the proposition follows. 

(v) For the sign test, that is, when the score function $K$ reduces to a
constant, the necessary and sufficient condition (5.3) takes the form

(A.5) $\Big\| n^{-1/2}\sum_{i=1}^{n} \operatorname{vec}\Big(\mathbf{U}_i\mathbf{U}_i' - \frac{1}{k}\mathbf{I}_k\Big) \Big\| \to \infty$, under $\mathcal{K}^{(n)}(f)$, as $n \to \infty$.

The central limit theorem implies that this happens iff the summands in
(A.5) are incorrectly centered, that is, whenever (5.9) holds. This completes
the proof of Proposition 5.1. □
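The dichotomy behind part (v) can be visualised with a small simulation. The sketch below is an illustration only, under assumptions not taken from the text: the center of symmetry is known and equal to the origin, $k = 2$, the null sample is standard bivariate normal, and the alternative is obtained by multiplying the first coordinate by a hypothetical factor of 3. The function names are likewise hypothetical. Under sphericity the normalised sum of $\operatorname{vec}(\mathbf{U}_i\mathbf{U}_i' - \frac{1}{k}\mathbf{I}_k)$ stays bounded in probability, whereas under the alternative its norm grows at rate $n^{1/2}$ because the summands no longer have mean zero.

```python
import numpy as np

def sign_statistic_norm(x):
    # Euclidean norm of n^{-1/2} * sum_i vec(U_i U_i' - I_k / k),
    # where U_i = X_i / ||X_i|| (center of symmetry assumed to be the origin).
    n, k = x.shape
    u = x / np.linalg.norm(x, axis=1, keepdims=True)
    outer = u[:, :, None] * u[:, None, :]           # stack of the matrices U_i U_i'
    total = (outer - np.eye(k) / k).sum(axis=0) / np.sqrt(n)
    return float(np.linalg.norm(total))             # Frobenius norm of the k x k sum

def demo(n=4000, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 2))
    null_value = sign_statistic_norm(z)                           # spherical sample
    alt_value = sign_statistic_norm(z * np.array([3.0, 1.0]))     # stretched sample
    return null_value, alt_value

print(demo())
```

In this particular alternative the expectation of $U_{1}^2$ is $3/4$ rather than $1/2$, so the norm of the statistic is of order $0.35\,n^{1/2}$, while under the null it remains of order one.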